Order Number: 317694-015
Revision 2.4

Intel® 82574 GbE Controller Family
Datasheet

Product Features

PCI Express* (PCIe*)
  - 64-bit address master support for systems using more than 4 GB of physical memory
  - Programmable host memory receive buffers (256 bytes to 16 KB)
  - Intelligent interrupt generation features to enhance driver performance
  - Descriptor ring management hardware for transmit and receive
  - Software controlled reset (resets everything except the configuration space)
  - Message Signaled Interrupts (MSI and MSI-X)
  - Configurable receive and transmit data FIFO, programmable in 1 KB increments

MAC
  - Flow control support compliant with the 802.3x specification
  - VLAN support compliant with the 802.1Q specification
  - MAC address filters: perfect match unicast filters; multicast hash filtering, broadcast filter, and promiscuous mode
  - Statistics for management and RMON
  - MAC loopback

PHY
  - Compliant with the 1 Gb/s IEEE 802.3, 802.3u, and 802.3ab specifications
  - IEEE 802.3ab auto-negotiation support
  - Full duplex operation at 10/100/1000 Mb/s
  - Half duplex at 10/100 Mb/s
  - Auto MDI, MDI-X crossover at all speeds

High Performance
  - TCP segmentation capability compatible with large send offloading features
  - Support for up to 256 KB TCP segmentation (TSO v2)
  - Fragmented UDP checksum offload for packet reassembly
  - IPv4 and IPv6 checksum offload support (receive, transmit, and large send)
  - Split header support
  - Receive Side Scaling (RSS) with two hardware receive queues
  - 9 KB jumbo frame support
  - 40 KB packet buffer size

Manageability
  - NC-SI for remote management core
  - SMBus advanced pass-through interface

Low Power
  - Magic Packet* wake-up enable with unique MAC address
  - ACPI register set and power down functionality supporting D0 and D3 states
  - Full wake-up support (APM and ACPI 2.0)
  - Smart power down at S0 no link and Sx no link
  - LAN disable function

Technology
  - 9 mm x 9 mm 64-pin QFN package with Exposed Pad*
  - Configurable LED operation for customization of LED displays
  - TimeSync offload compliant with the 802.1AS specification
  - Wider operating temperature range: -40 °C to 85 °C (82574IT only)
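The packet-buffer figures in the feature list (a 40 KB packet buffer, receive and transmit FIFOs programmable in 1 KB increments, 9 KB jumbo frames) imply a simple sizing constraint. The sketch below only illustrates that arithmetic. The function names are hypothetical, and the assumption that the receive and transmit allocations together fill the whole 40 KB is not stated in the feature list; no device register is read or written.

```python
from typing import Tuple

PACKET_BUFFER_KB = 40  # total packet buffer size, from the feature list
STEP_KB = 1            # FIFO sizes are programmable in 1 KB increments
JUMBO_FRAME_KB = 9     # largest supported jumbo frame, from the feature list


def split_packet_buffer(rx_kb: int) -> Tuple[int, int]:
    """Return (rx_kb, tx_kb) for a requested receive allocation.

    Assumption (not from the feature list): the receive and transmit
    allocations together use the entire packet buffer.
    """
    if not isinstance(rx_kb, int) or rx_kb % STEP_KB != 0:
        raise ValueError("allocation must be a whole number of KB")
    if not 0 < rx_kb < PACKET_BUFFER_KB:
        raise ValueError("receive allocation must leave room for transmit")
    return rx_kb, PACKET_BUFFER_KB - rx_kb


def fits_jumbo_frame(alloc_kb: int) -> bool:
    """True if one maximum-size jumbo frame fits in the allocation."""
    return alloc_kb >= JUMBO_FRAME_KB
```

For example, `split_packet_buffer(31)` leaves 9 KB for transmit, which is just enough for one jumbo frame under these assumptions; a receive allocation of 32 KB would not be.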
Legal Lines and Disclaimers

Information in this document is provided in connection with Intel® products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in Intel's terms and conditions of sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, life sustaining, critical control or safety systems, or in nuclear facility applications.

Intel may make changes to specifications and product descriptions at any time, without notice.

Intel Corporation may have patents or pending patent applications, trademarks, copyrights, or other intellectual property rights that relate to the presented subject matter. The furnishing of documents and other materials and information does not provide any license, express or implied, by estoppel or otherwise, to any such patents, trademarks, copyrights, or other intellectual property rights.

IMPORTANT - PLEASE READ BEFORE INSTALLING OR USING INTEL® PRE-RELEASE PRODUCTS.

Please review the terms at http://www.intel.com/netcomms/prerelease_terms.htm carefully before using any Intel® pre-release product, including any evaluation, development or reference hardware and/or software product (collectively, "Pre-Release Product"). By using the Pre-Release Product, you indicate your acceptance of these terms, which constitute the agreement (the "Agreement") between you and Intel Corporation ("Intel"). In the event that you do not agree with any of these terms and conditions, do not use or install the Pre-Release Product and promptly return it unused to Intel.

Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.

Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. See http://www.intel.com/products/processor_number for details.

This product has not been tested with every possible configuration/setting. Intel is not responsible for the product's failure in any configuration/setting, whether tested or untested.

The 82574 GbE Controller may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Hyper-Threading Technology requires a computer system with an Intel® Pentium® 4 processor supporting HT Technology and a HT Technology enabled chipset, BIOS and operating system. Performance will vary depending on the specific hardware and software you use. See http://www.intel.com/products/ht/hyperthreading_more.htm for additional information.

Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.

Copies of documents which have an ordering number and are referenced in this document, or other Intel literature, may be obtained from:

Intel Corporation
P.O. Box 5937
Denver, CO 80217-9808

or call in North America 1-800-548-4725, Europe 44-0-1793-431-155, France 44-0-1793-421-777, Germany 44-0-1793-421-333, other countries 708-296-9333.

Intel and Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

*Other names and brands may be claimed as the property of others.

Copyright © 2008, Intel Corporation. All Rights Reserved.
Contents

1.0 Introduction
    1.1 Scope
    1.2 Number Conventions
    1.3 Acronyms
    1.4 Reference Documents
    1.5 82574 Architecture Block Diagram
    1.6 System Interface
    1.7 Features Summary
    1.8 Product Codes
2.0 Pin Interface
    2.1 Pin Assignments
    2.2 Pull-Up/Pull-Down Resistors and Strapping Options
    2.3 Signal Type Definition
        2.3.1 PCIe
        2.3.2 NVM Port
        2.3.3 System Management Bus (SMBus) Interface
        2.3.4 NC-SI and Testability
        2.3.5 LEDs
        2.3.6 PHY Pins
        2.3.7 Miscellaneous Pin
        2.3.8 Power Supplies and Support Pins
    2.4 Package
3.0 Interconnects
    3.1 PCIe
        3.1.1 Architecture, Transaction, and Link Layer Properties
        3.1.2 General Functionality
        3.1.3 Transaction Layer
        3.1.4 Flow Control
        3.1.5 Host I/F
        3.1.6 Error Events and Error Reporting
        3.1.7 Link Layer
        3.1.8 PHY
        3.1.9 Performance Monitoring
    3.2 Ethernet Interface
        3.2.1 MAC/PHY GMII/MII Interface
        3.2.2 Duplex Operation for Copper PHY/GMII/MII Operation
        3.2.3 Auto-Negotiation & Link Setup Features
        3.2.4 Loss of Signal/Link Status Indication
        3.2.5 10/100 Mb/s Specific Performance Enhancements
        3.2.6 Flow Control
    3.3 SPI Non-Volatile Memory Interface
        3.3.1 General Overview
        3.3.2 Supported NVM Devices
        3.3.3 NVM Device Detection
        3.3.4 Device Operation with an External EEPROM
        3.3.5 Device Operation with Flash
        3.3.6 Shadow RAM
        3.3.7 NVM Clients and Interfaces
        3.3.8 NVM Write and Erase Sequence
    3.4 System Management Bus (SMBus)
    3.5 NC-SI
        3.5.1 Interface Specification
        3.5.2 Electrical Characteristics
4.0 Initialization
    4.1 Introduction
    4.2 Reset Operation
    4.3 Power Up
        4.3.1 Power-Up Sequence
        4.3.2 Timing Diagram
    4.4 Global Reset (PE_RST_N, PCIe In-Band Reset)
        4.4.1 Reset Sequence
        4.4.2 Timing Diagram
    4.5 Timing Parameters
        4.5.1 Timing Requirements
    4.6 Software Initialization Sequence
        4.6.1 Interrupts During Initialization
        4.6.2 Global Reset and General Configuration
        4.6.3 Link Setup Mechanisms and Control/Status Bit Summary
        4.6.4 Initialization of Statistics
        4.6.5 Receive Initialization
        4.6.6 Transmit Initialization
5.0 Power Management and Delivery
    5.1 Assumptions
    5.2 Power Consumption
    5.3 Power Delivery
        5.3.1 The 1.9 V DC Rail
        5.3.2 The 1.05 V DC Rail
    5.4 Power Management
        5.4.1 82574 Power States
        5.4.2 Auxiliary Power Usage
        5.4.3 Power Limits by Certain Form Factors
        5.4.4 Power States
        5.4.5 Timing of Power-State Transitions
    5.5 Wake Up
        5.5.1 Advanced Power Management Wake Up
        5.5.2 PCIe Power Management Wake Up
        5.5.3 Wake-Up Packets
6.0 Non-Volatile Memory (NVM) Map
    6.1 EEUPDATE
    6.2 Basic Configuration Table
        6.2.1 Hardware Accessed Words
        6.2.2 Software Accessed Words
    6.3 Manageability Configuration Words
        6.3.1 SMBus APT Configuration Words
        6.3.2 NC-SI Configuration Words
7.0 Inline Functions
    7.1 Packet Reception
        7.1.1 Packet Address Filtering
        7.1.2 Receive Data Storage
        7.1.3 Legacy Receive Descriptor Format
        7.1.4 Extended Rx Descriptor
        7.1.5 Packet Split Receive Descriptor
        7.1.6 Receive Descriptor Fetching
        7.1.7 Receive Descriptor Write Back
        7.1.8 Receive Descriptor Queue Structure
        7.1.9 Receive Interrupts
        7.1.10 Receive Packet Checksum Offloading
        7.1.11 Multiple Receive Queues and Receive-Side Scaling (RSS)
    7.2 Packet Transmission
        7.2.1 Transmit Functionality
        7.2.2 Transmission Flow Using Simplified Legacy Descriptors
        7.2.3 Transmission Process Flow Using Extended Descriptors
        7.2.4 Transmit Descriptor Ring Structure
        7.2.5 Multiple Transmit Queues
        7.2.6 Overview of On-Chip Transmit Modes
        7.2.7 Pipelined Tx Data Read Requests
        7.2.8 Transmit Interrupts
        7.2.9 Transmit Data Storage
        7.2.10 Transmit Descriptor Formats
        7.2.11 Extended Data Descriptor Format
    7.3 TCP Segmentation
        7.3.1 TCP Segmentation Performance Advantages
        7.3.2 Ethernet Packet Format
        7.3.3 TCP Segmentation Data Descriptors
        7.3.4 TCP Segmentation Source Data
        7.3.5 Hardware Performed Updating for Each Frame
        7.3.6 TCP Segmentation Use of Multiple Data Descriptors
    7.4 Interrupts
        7.4.1 Legacy and MSI Interrupt Modes
        7.4.2 MSI-X Mode
        7.4.3 Registers
        7.4.4 Interrupt Moderation
        7.4.5 Clearing Interrupt Causes
    7.5 802.1Q VLAN Support
        7.5.1 802.1Q VLAN Packet Format
        7.5.2 Transmitting and Receiving 802.1Q Packets
        7.5.3 802.1Q VLAN Packet Filtering
    7.6 LEDs
    7.7 Time Sync (IEEE 1588 and 802.1AS)
        7.7.1 Overview
        7.7.2 Flow and Hardware/Software Responsibilities
        7.7.3 Hardware Time Sync Elements
        7.7.4 PTP Packet Structure
8.0 System Manageability
    8.1 Scope
    8.2 Pass-Through (PT) Functionality
    8.3 Components of a Sideband Interface
    8.4 SMBus Pass-Through Interface
        8.4.1 General
        8.4.2 Pass-Through Capabilities
        8.4.3 Manageability Receive Filtering
        8.4.4 SMBus Transactions
        8.4.5 SMBus Notification Methods
    8.5 Receive TCO Flow
    8.6 Transmit TCO Flow
        8.6.1 Transmit Errors in Sequence Handling
        8.6.2 TCO Command Aborted Flow
    8.7 SMBus ARP Transactions
        8.7.1 Prepare to ARP
        8.7.2 Reset Device (General)
        8.7.3 Reset Device (Directed)
        8.7.4 Assign Address
        8.7.5 Get UDID (General and Directed)
    8.8 SMBus Pass-Through Transactions
        8.8.1 Write Transactions
        8.8.2 Read Transactions (82574 to MC)
    8.9 SMBus Troubleshooting
        8.9.1 SMBus Commands Are Always NACK'd by the 82574
        8.9.2 SMBus Clock Speed Is 16.6666 kHz
        8.9.3 A Network Based Host Application Is Not Receiving Any Network Packets
        8.9.4 Status Registers
        8.9.5 Unable to Transmit Packets from the MC
        8.9.6 SMBus Fragment Size
        8.9.7 Enable XSUM Filtering
        8.9.8 Still Having Problems?
    8.10 NC-SI Interface
    8.11 Overview
        8.11.1 Terminology
        8.11.2 System Topology
        8.11.3 Data Transport
    8.12 NC-SI Support
        8.12.1 Supported Features
        8.12.2 NC-SI Mode - Intel Specific Commands
    8.13 Basic NC-SI Workflows
        8.13.1 Package States
        8.13.2 Channel States
        8.13.3 Discovery
        8.13.4 Configurations
        8.13.5 Pass-Through Traffic States
        8.13.6 Asynchronous Event Notifications
        8.13.7 Querying Active Parameters
    8.14 Resets
    8.15 Advanced Workflows
        8.15.1 Multi-NC Arbitration
        8.15.2 External Link Control
        8.15.3 Statistics
9.0 Programming Interface
    9.1 PCIe Configuration Space
        9.1.1 PCIe Compatibility
        9.1.2 Mandatory PCI Configuration Registers
        9.1.3 PCI Power Management Registers
        9.1.4 Message Signaled Interrupt (MSI) Configuration Registers
        9.1.5 MSI-X Configuration
        9.1.6 PCIe Configuration Registers
10.0 Driver Programming Interface
    10.1 Introduction
        10.1.1 Memory and I/O Address Decoding
        10.1.2 Registers Byte Ordering
        10.1.3 Register Conventions
    10.2 Configuration and Status Registers - CSR Space
        10.2.1 Register Summary Table
        10.2.2 General Register Descriptions
        10.2.3 PCIe Register Descriptions
        10.2.4 Interrupt Register Descriptions
        10.2.5 Receive Register Descriptions
        10.2.6 Transmit Register Descriptions
        10.2.7 Statistic Register Descriptions
        10.2.8 Management Register Descriptions
        10.2.9 Time Sync Register Descriptions
        10.2.10 MSI-X Register Descriptions
        10.2.11 PHY Registers
        10.2.12 Diagnostic Register Descriptions
11.0 Diagnostics
    11.1 Introduction
    11.2 FIFO Pointer Accessibility
    11.3 FIFO Data Accessibility
    11.4 Loopback Operations
12.0 Electrical Specifications
    12.1 Introduction
    12.2 Voltage Regulator Power Supply Specification
        12.2.1 3.3 V DC Rail
        12.2.2 1.9 V DC Rail
        12.2.3 1.05 V DC Rail
        12.2.4 PNP Specifications
    12.3 Power Sequencing
    12.4 Power-On Reset
    12.5 Power Scheme Solutions
    12.6 Discrete/Integrated Magnetics Specifications
    12.7 Oscillator/Crystal Specifications
    12.8 I/O DC Parameters
        12.8.1 Test, JTAG and NC-SI
        12.8.2 LEDs
        12.8.3 SMBus
13.0 Design Considerations
    13.1 PCIe
        13.1.1 Port Connection to the 82574
        13.1.2 PCIe Reference Clock
        13.1.3 Other PCIe Signals
        13.1.4 PCIe Routing
    13.2 Clock Source
        13.2.1 Frequency Control Device Design Considerations
        13.2.2 Frequency Control Component Types
    13.3 Crystal Support
        13.3.1 Crystal Selection Parameters
        13.3.2 Crystal Placement and Layout Recommendations
    13.4 Oscillator Support
        13.4.1 Oscillator Placement and Layout Recommendations
    13.5 Ethernet Interface
        13.5.1 Magnetics for 1000 BASE-T
        13.5.2 Magnetics Module Qualification Steps
        13.5.3 Third-Party Magnetics Manufacturers
        13.5.4 Layout Considerations for the Ethernet Interface
        13.5.5 Physical Layer Conformance Testing
82574 gbe controller?datasheet 8 13.5.6 troubleshooting common physical layout i ssues ...................................... 433 13.6 smbus and nc-si ............................................................................................ 434 13.6.1 nc-si electrical interface requirements................................................... 435 13.7 82574 power supplies ...................................................................................... 439 13.7.1 82574 gbe controller power sequencing.................................................. 439 13.7.2 power and ground planes ...................................................................... 441 13.8 device disable................................................................................................. 441 13.8.1 bios handling of device disable ............................................................. 442 13.9 82574 exposed pad* ........................................................................................ 442 13.9.1 introduction ......................................................................................... 442 13.9.2 component pad, solder mask and solder paste ......................................... 443 13.9.3 landing pattern a (no via in pad) ........................................................... 444 13.9.4 landing pattern b (thermal relief; no via in pad)..................................... 445 13.10 xor testing .................................................................................................... 446 14.0 thermal design considerations .............................................................................. 448 14.1 introduction .................................................................................................... 448 14.2 intended audience ........................................................................................... 
448 14.3 measuring the thermal conditions ..................................................................... 448 14.4 thermal considerations .................................................................................... 448 14.5 packaging terminology..................................................................................... 449 14.6 product package thermal specification ............................................................... 449 14.7 thermal specifications...................................................................................... 450 14.7.1 case temperature ................................................................................ 450 14.7.2 designing for thermal performance......................................................... 450 14.8 thermal attributes ........................................................................................... 451 14.8.1 typical system definitions .................. ................................................... 451 14.9 82574 package thermal characteristics .............................................................. 452 14.10 reliability ............... ............ ........... ............ ........... .......... ........... ........ ............. 452 14.11 measurements for thermal specifications............................................................ 453 14.12 case temperature measurements ...................................................................... 453 14.12.1attaching the thermocouple................................................................... 454 14.13 conclusion ...................................................................................................... 454 14.14 pcb guidelines ................................................................................................ 455 15.0 board layout and schematic checklists ................................................................. 
456 16.0 models ................................................................................................................... 466 17.0 reference schematics ............................................................................................ 468
revision history

february 2009, revision 2.4:
• updated sections 6.3.1.3, 10.2.3.11, and 10.2.8.8.
• updated table 66.

december 2008, revision 2.3:
• added section 8.12.2.3 - set intel management control formats.
• added section 8.12.3.4 - get intel management control formats.
• added section 10.2.3.12 - 3gpio control register 2 - gcr2.
• updated section 13.1.4 - pcie routing.
• updated section 13.10 - added "the xor tree is output on the led1 pin".
• updated table 97 - schematic checklist.

october 2008, revision 2.2:
• changed pcie rev. 2.0 (2.5 ghz) x1 to pcie rev. 1.1 (2.5 ghz) x1 in section 1.0.
• added multi-drop application connectivity requirements to section 13.6.1.2.

august 2008, revision 2.1:
• updated title page - changed packet buffer size from 32 kb to 40 kb.
• updated section 15 - corrected nc-si schematic checklist information.
• updated reference schematics - corrected nc-si schematic information.

june 2008, revision 2.0:
• initial public release.

february 2008, revision 1.7:
• updated section 5.2.
• added a note to table 31.
• updated section 13.5.5.13.
• added 82574it ordering information.

february 2008, revision 1.6:
• quick fix provided which added measured power consumption (section 5.2). this is a temporary patch. note that the fix does not appear in the toc or list of tables yet. this will be corrected next week.

january 2008, revision 1.5:
• changed section 10.2.2.2 bit 31 assignment from 1b to 0b.
• changed word 0x0f bit 7 assignment (1b to 0b).
• added new section 14 "thermal design considerations".
• updated mng mode description (loads from nvm word 0xf instead of word 10).
• updated the 82574l resets table.
• added note "the 82574l requests i/o resources to support pre-boot operation (prior to the allocation of physical memory base addresses)".
• updated cap offset 0xe4 bit 15 description.
• updated default values for uncorrectable error severity and correctable error mask registers.
• updated figure 52.
• updated value1 and value2 byte numbers in section 10.2.8.19.
• changed crystal drive level to 300 µw.
• changed all 1.0 v dc references to 1.05 v dc.
• changed all 1.8 v dc references to 1.9 v dc.
• deleted "default value of 0x5f20 and 0x5f28 are loaded from the nvm at power up" from the fflt register description.
• added a note for eitr that in 10/100 mb/s mode, the interval time is multiplied by four.
• updated the type and internal/external pu/pd for nc-si pins.
• updated the nvmt pinout description.
• updated mng_mode to be loaded from nvm word 0x0f (instead of nvm word 0x10).
• updated default values for uncorrectable error severity and correctable error mask registers.
• updated section 9.1.6.1.7. where applicable, changed milliseconds to microseconds (bits 14:12 and 17:15).
• removed wupl register information.
• noted that manageability can be supported with a 32 kb eeprom.

november 2007, revision 1.1:
• updated nvmt symbol description in section 2.3.4, table 10.

october 2007, revision 1.0:
• updated sections 2, 3, 4, 5, 9, 12, and 13; as indicated by the change bars in the left margin.

august 2007, revision 0.7:
• updated sections 2, 3, 5, 6, 8, 10, and 12.
• added sections 13, 14, 15, and 16.

july 2007, revision 0.6:
• added section 12.0 "electrical specifications".
• updated section 2.0 "pin interface".

june 2007, revision 0.5:
• initial release (intel confidential).
1.0 introduction

the 82574 family (82574l and 82574it) are single, compact, low power components that offer a fully-integrated gigabit ethernet media access control (mac) and physical layer (phy) port. the 82574 uses the pci express* (pcie*) architecture and provides a single-port implementation in a relatively small area so it can be used for server and client configurations as a lan on motherboard (lom) design. the 82574 family can also be used in embedded applications such as switch add-on cards and network appliances.

external interfaces provided on the 82574:
• pcie rev. 1.1 (2.5 ghz) x1
• mdi (copper) standard ieee 802.3 ethernet interface for 1000base-t, 100base-tx, and 10base-t applications (802.3, 802.3u, and 802.3ab)
• nc-si or smbus connection to a manageability controller (mc)
• ieee 1149.1 jtag (note that bsdl testing is not supported)

additional product details:
• 9 mm x 9 mm 64-pin qfn package
• support for pci 3.0 vital product data (vpd)
• ipmi mc pass through; multi-drop nc-si
• timesync offload compliant with 802.1as specification

1.1 scope

this document presents the architecture (including device operation, pin descriptions, register definitions, etc.) for the 82574. this document is intended to be a reference for software device driver developers, board designers, test engineers, or others who might need specific technical or programming information about the 82574.

1.2 number conventions

unless otherwise specified, numbers are represented as follows:
• hexadecimal numbers are identified by a "0x" prefix on the number (0x2a, 0x12).
• binary numbers are identified by a "b" suffix on the number (0011b).

however, values for smbus transactions in diagrams are listed in binary without the "b" or in hexadecimal without the "0x". any other numbers without a prefix or suffix are intended as decimal numbers.
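the number conventions of section 1.2 can be sketched as a small parser, which may help when register values are copied out of this document into scripts. this is an illustrative sketch only: the function name `parse_datasheet_number` is hypothetical and not part of any intel tool, and the smbus diagram exception (values written without "0x" or "b") is deliberately not handled because those values cannot be told apart from decimal numbers.

```python
def parse_datasheet_number(text: str) -> int:
    """Parse a number written in the notation of section 1.2 (hypothetical helper)."""
    token = text.strip().lower()
    if token.startswith("0x"):
        # Hexadecimal: "0x" in front of the digits, for example 0x2a.
        return int(token[2:], 16)
    if len(token) > 1 and token.endswith("b") and set(token[:-1]) <= {"0", "1"}:
        # Binary: "b" after the digits, for example 0011b or 1b.
        return int(token[:-1], 2)
    # Anything else is treated as decimal.
    return int(token, 10)


if __name__ == "__main__":
    for sample in ("0x2a", "0x12", "0011b", "1b", "256"):
        print(sample, "->", parse_datasheet_number(sample))
```

the hexadecimal check runs first so that a value such as 0x1b is read as hexadecimal rather than as a binary number with a "b" suffix.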
1.3 acronyms

the following is a list of acronyms that are used throughout this document.

acronym | definition
ack | acknowledge.
ara | smbus alert response address.
arp | address resolution protocol.
asf | alert standard format. the manageability protocol specification defined by the dmtf.
mc | manageability controller. the general name for an external tco controller, relevant only in tco mode.
csr | control and status register. usually refers to a hardware register.
dhcp | dynamic host configuration protocol. a tcp/ip protocol that enables a client to receive a temporary ip address over the network from a remote server.
dmtf | the international organization responsible for managing and maintaining the asf specification.
ieee | institute of electrical and electronics engineers.
ip | internet protocol. the protocol within tcp/ip that governs the breakup and reassembly of data messages into packets and the packet routing within the network.
ip address | the 4-byte or 16-byte address that designates the ethernet controller within the ip communication protocol. this address is dynamic and can be updated frequently during runtime.
ipmi | intelligent platform management interface specification.
lan | local area network. also known as the ethernet.
mac address | the 6-byte address that designates the ethernet controller within the ethernet protocol. this address is constant and unique per ethernet controller.
na | not applicable.
nack | not acknowledged.
nc-si | network controller sideband interface. new dmtf industry standard sideband interface.
nic | network interface card. generic name for an ethernet controller that resides on a printed circuit board (pcb).
os | operating system. usually designates the pc system's software.
pec | the smbus checksum signature, sent at the end of an smbus packet. an smbus device can be configured either to require or not require this signature.
pet | platform event trap.
pt | pass-through. also known as tco mode.
psa | smbus persistent slave address device. in the smbus 2.0 specification, this designates an smbus device whose address is stored in non-volatile memory.
rmcp | remote management and control protocol.
rsp | rmcp security extensions protocol.
sa | security association.
smbus | system management bus.
snmp | simple network management protocol.
tco | total cost of ownership.
tbd | to be defined.

1.4 reference documents

other reference documents include:
• intel® 82574 family gbe controller specification update, intel corporation.
• pci express* specification v2.0 (2.5 gt/s)
• advanced configuration and power interface specification
• pci bus power management interface specification

document name | version | owner | location
smbus specification | 2.0 | sbs forum | http://www.smbus.org/
i2c specification | 2.1 | philips semiconductors | http://www.philipslogic.com/
nc-si specification | 1.0 | dmtf | http://www.dmtf.org/ (search for nc-si)
1.5 82574 architecture block diagram

figure 1 shows a high-level architecture block diagram for the 82574.

[figure 1. 82574 architecture block diagram. labels: pcie i/f, rx/tx dma, rx/tx fifo, transmit switch, filter, mac, phy, rmii i/f, smbus i/f, nc-si, rx/tx fifo, rmii, smbus, pcie link.]

1.6 system interface

the 82574 provides one pcie lane operating at 2.5 ghz with sufficient bandwidth to support 1000 mb/s transfer rate. 40 kb of on-chip buffering mitigates instantaneous receive bandwidth demands and eliminates transmit underruns by buffering the entire outgoing packet prior to transmission.

1.7 features summary

this section describes the 82574's features that were present in previous intel client gbe controllers and those features that are new to the 82574.
table 1. network features

feature | 82574 | 83573l
compliant with the 1 gb/s ethernet 802.3 802.3u 802.3ab specifications | y | y
multi-speed operation: 10/100/1000 mb/s | y | y
full-duplex operation at 10/100/1000 mb/s | y | y
half-duplex operation at 10/100 mb/s | y | y
flow control support compliant with the 802.3x specification | y | y
vlan support compliant with the 802.1q specification | y | y
mac address filters: perfect match unicast filters; multicast hash filtering, broadcast filter and promiscuous mode | y | y
configurable led operation for oem customization of led displays | y | y
statistics for management and rmon | y | y
mac loopback | y | y

table 2. host interface features

feature | 82574 | 83573l
pcie interface to chipset | y | y
64-bit address master support for systems using more than 4 gb of physical memory | y | y
programmable host memory receive buffers (256 bytes to 16 kb) | y | y
intelligent interrupt generation features to enhance software device driver performance | y | y
descriptor ring management hardware for transmit and receive | y | y
software controlled reset (resets everything except the configuration space) | y | y
message signaled interrupts (msi) | y | y
msi-x | y | n
table 3. manageability features

feature | 82574 | 83573l
nc-si over rmii for remote management core | y | n
smbus advanced pass through | y | n

table 4. performance features

feature | 82574 | 83573l
configurable receive and transmit data fifo; programmable in 1 kb increments | y | y
tcp segmentation capability compatible with nt 5.x tcp segmentation offload (tso) features | y | y
supports up to 256 kb tso (tso v2) | y | n
fragmented udp checksum offload for packet re-assembly | y | y
ipv4 and ipv6 checksum offload support (receive, transmit, and tso) | y | y
split header support | y | y
receive side scaling (rss) with two hardware receive queues | y | n
supports 9018-byte jumbo packets | y | y
packet buffer size | 40 kb | 32 kb
timesync offload compliant with 802.1as specification | y | n

table 5. power management features

feature | 82574 | 83573l
magic packet wake-up enable with unique mac address | y | y
acpi register set and power down functionality supporting d0 and d3 states | y | y
full wake-up support (apm and acpi 2.0) | y | y
smart power down at s0 no link and sx no link | y | y
lan disable functionality | y | y
1.8 product codes

table 6 lists the product ordering codes for the 82574 family.

table 6. product ordering codes

part number | product name | description
WG82574L | intel® 82574l gigabit network connection | embedded and entry server gbe lan. operates using a standard temperature range (0 °c to 85 °c).
wg82574it | intel® 82574it gigabit network connection | embedded and entry server gbe lan. operates using a wider temperature range (-40 °c to 85 °c).
2.0 pin interface

2.1 pin assignments

the 82574 supports a 64-pin, 9 x 9 qfn package with an exposed pad* (e-pad*). note that the e-pad is ground.

[figure 2. 82574 64-pin, 9 x 9 qfn package with e-pad (9 mm x 9 mm, 0.5 mm pin pitch, with exposed pad*). the lead number of each signal is listed in tables 7 through 15.]
2.2 pull-up/pull-down resistors and strapping options

• as stated in the name and function table columns, the internal pull-up/pull-down (pu/pd) resistor values are 30 kΩ ±50%.
• only relevant (digital) pins are listed; analog or bias and power pins have specific considerations listed in section 12.0.
• nvmt and aux_pwr are used for a static configuration. they are sampled while pe_rst_n is active and latched when pe_rst_n is deasserted. at other times, they revert to their standard usage.

2.3 signal type definition

type | description
in | input is a standard input-only signal.
out (o) | totem pole output is a standard active driver.
t/s | tri-state is a bi-directional, tri-state input/output pin.
s/t/s | sustained tri-state is an active low tri-state signal owned and driven by one and only one agent at a time. the agent that drives an s/t/s pin low must drive it high for at least one clock before letting it float. a new agent cannot start driving an s/t/s signal any sooner than one clock after the previous owner tri-states it.
o/d | open drain enables multiple devices to share as a wire-or.
a-in | analog input signals.
a-out | analog output signals.
b | input bias.
nc-si_in | nc-si input signal.
nc-si_out | nc-si output signal.

2.3.1 pcie

table 7. pcie

symbol | lead # | type | op mode | name and function
peclkp, peclkn | 26, 25 | a-in | input | pcie differential reference clock in. this pin receives a 100 mhz differential clock input. this clock is used as the reference clock for the pcie tx/rx circuitry and by the pcie core pll to generate a 125 mhz clock and 250 mhz clock for the pcie core logic.
pe_tp, pe_tn | 21, 20 | a-out | output | pcie serial data output. serial differential output link in the pcie interface running at 2.5 gb/s. this output carries both data and an embedded 2.5 ghz clock that is recovered along with data at the receiving end.
pe_rp, pe_rn | 24, 23 | a-in | input | pcie serial data input. serial differential input link in the pcie interface running at 2.5 gb/s. the embedded clock present in this input is recovered along with the data.
pe_wake_n/jtag_tdo | 16 | o/d | output | wake. the 82574 drives this signal to zero when it detects a wake-up event and either the pme_en bit in pmcsr is 1b or the apme bit of the wake up control (wuc) register is 1b. jtag tdo output.
pe_rst_n | 17 | in | input | power and clock good indication. the pe_rst_n signal indicates that both pcie power and clock are available.

2.3.2 nvm port

table 8. nvm port

symbol | lead # | type | op mode | name and function
nvm_si | 12 | t/s | output | serial data output. connect this lead to the input of the non-volatile memory (nvm). note: the nvm_si port pin includes an internal pull-up resistor.
nvm_so | 14 | t/s | input | serial data input. connect this lead to the output of the nvm. note: the nvm_so port pin includes an internal pull-up resistor.
nvm_sk | 13 | t/s | output | non-volatile memory serial clock. note: the nvm_sk port pin includes an internal pull-up resistor.
nvm_cs_n | 15 | t/s | output | non-volatile memory chip select output. note: the nvm_cs_n port pin includes an internal pull-up resistor.
2.3.3 system management bus (smbus) interface

note: if the smbus is disconnected, an external pull-up should be used for these pins, unless it is guaranteed that manageability is disabled in the 82574.

table 9. smbus interface

symbol | lead # | type | op mode | name and function
smb_dat | 36 | t/s, o/d | bi-dir | smbus data. stable during the high period of the clock (unless it is a start or stop condition).
smb_clk | 34 | t/s, o/d | bi-dir | smbus clock. one clock pulse is generated for each data bit transferred.
smb_alrt_n | 35 | t/s, o/d | output | smbus alert. acts as an interrupt pin of a slave device on the smbus in pass-through mode.

2.3.4 nc-si and testability

table 10. nc-si and testability

symbol | lead # | type | op mode | name and function
nc_si_clk_in | 2 | nc-si_in | input | nc-si reference clock input. synchronous clock reference for receive, transmit, and control interface. this signal is a 50 mhz clock +/- 50 ppm. note: if not used, should have an external pull-down resistor. also, this clock is in addition to and separate from the xtal clock.
nc_si_crs_dv | 3 | nc-si_out | output | nc-si carrier sense/receive data valid (crs/dv).
nc_si_rxd0 | 6 | nc-si_out | output | nc-si receive data 0. data signals to the manageability controller (mc).
nc_si_rxd1 | 5 | nc-si_out | output | nc-si receive data 1. data signals to the mc.
nc_si_tx_en | 7 | nc-si_in | input | nc-si transmit enable. note: if not used, should have an external pull-down resistor.
nc_si_txd0 | 9 | nc-si_in | input | nc-si transmit data 0. data signals from the mc. note: if not used, should have an external pull-up resistor.
nc_si_txd1 | 8 | nc-si_in | input | nc-si transmit data 1. data signal from the mc. note: if not used, should have an external pull-up resistor.
test_en | 29 | in | input | enables test mode. test pins are overloaded on the functional signals as described in the pin description text of this section. the pin is active high. note: this pin should be externally pulled down for normal operation.
aux_pwr/jtag_tck | 39 | in | input | auxiliary power indication. aux_pwr is supported when sampled high and should be connected using a resistor. jtag clock input. note: the aux_pwr/jtag_tck port pin includes an internal pull-down resistor.
nvmt/jtag_tms | 38 | in | input | nvm type. the nvm is flash when sampled low and eeprom when sampled high. jtag tms input. note: the nvmt/jtag_tms port pin includes an internal pull-up resistor. also note that the internal pull-up is disconnected during startup. as a result, nvmt must be connected externally.
jtag_tdi | 40 | in | input | jtag tdi input. note: the jtag_tdi port pin includes an internal pull-up resistor.

2.3.5 leds

table 11 lists the functionality of each led output pin. the default activity of each led can be modified in the nvm. the led functionality is reflected and can be further modified in the configuration registers (ledctl).

table 11. leds

symbol | lead # | type | op mode | name and function
led0 | 31 | out | output | led0. programmable led.
led1 | 30 | out | output | led1. programmable led.
led2 | 33 | out | output | led2. programmable led.

2.3.6 phy pins

note: the 82574 has built in termination resistors. as a result, external termination resistors should not be used.
table 12. phy pins

symbol | lead # | type | op mode | name and function
mdi_plus[0], mdi_minus[0] | 58, 57 | a | bi-dir | media dependent interface[0]. 1000base-t: in mdi configuration, mdi[0]+/- corresponds to bi_da+/- and in mdi-x configuration mdi[0]+/- corresponds to bi_db+/-. 100base-tx: in mdi configuration, mdi[0]+/- is used for the transmit pair and in mdi-x configuration mdi[0]+/- is used for the receive pair. 10base-t: in mdi configuration, mdi[0]+/- is used for the transmit pair and in mdi-x configuration mdi[0]+/- is used for the receive pair.
mdi_plus[1], mdi_minus[1] | 55, 54 | a | bi-dir | media dependent interface[1]. 1000base-t: in mdi configuration, mdi[1]+/- corresponds to bi_db+/- and in mdi-x configuration mdi[1]+/- corresponds to bi_da+/-. 100base-tx: in mdi configuration, mdi[1]+/- is used for the receive pair and in mdi-x configuration mdi[1]+/- is used for the transmit pair. 10base-t: in mdi configuration, mdi[1]+/- is used for the receive pair and in mdi-x configuration mdi[1]+/- is used for the transmit pair.
mdi_plus[2], mdi_minus[2], mdi_plus[3], mdi_minus[3] | 53, 52, 50, 49 | a | bi-dir | media dependent interface[3:2]. 1000base-t: in mdi and in mdi-x configuration, mdi[2]+/- corresponds to bi_dc+/- and mdi[3]+/- corresponds to bi_dd+/-. 100base-tx: unused. 10base-t: unused.
xtal1, xtal2 | 43, 42 | a-in, a-out | input/output | xtal in/out. these pins can be driven by an external 25 mhz crystal or driven by an external mos level 25 mhz oscillator. used to drive the phy.
atest_p, atest_n | 45, 46 | a-out | output | positive side of the high speed differential debug port for the phy.
rset | 48 | a | bias | phy termination. this pin should be connected through a 4.99 kΩ ±1% resistor to ground.

2.3.7 miscellaneous pin

table 13. miscellaneous pin

symbol | lead # | type | op mode | name and function
dev_off_n | 28 | in | input | this is a 3.3 v dc input signal. asserting dev_off_n puts the 82574 in device disable mode. note that this pin is asynchronous.
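the mdi and mdi-x pair assignments in table 12 can be summarized as a lookup. the following is a minimal sketch that only restates the table; the function name `mdi_pair_roles` and its parameters are hypothetical and do not correspond to any register or driver interface of the 82574.

```python
from typing import Dict


def mdi_pair_roles(speed_mbps: int, crossover: bool) -> Dict[int, str]:
    """Return the role of each mdi[n] pair as listed in table 12.

    speed_mbps: 10, 100 or 1000.
    crossover:  False for mdi configuration, True for mdi-x configuration.
    """
    if speed_mbps == 1000:
        # Pairs 0 and 1 swap bi_da/bi_db in mdi-x; pairs 2 and 3 never change.
        pair0, pair1 = ("bi_db", "bi_da") if crossover else ("bi_da", "bi_db")
        return {0: pair0, 1: pair1, 2: "bi_dc", 3: "bi_dd"}
    if speed_mbps in (10, 100):
        # Pair 0 transmits and pair 1 receives in mdi; the roles swap in mdi-x.
        pair0, pair1 = ("receive", "transmit") if crossover else ("transmit", "receive")
        return {0: pair0, 1: pair1, 2: "unused", 3: "unused"}
    raise ValueError(f"unsupported speed: {speed_mbps} mb/s")


if __name__ == "__main__":
    for speed in (10, 100, 1000):
        for crossover in (False, True):
            mode = "mdi-x" if crossover else "mdi"
            print(speed, mode, mdi_pair_roles(speed, crossover))
```

only pairs 0 and 1 ever change roles, which is why a crossover affects the same two pairs at every speed.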
2.3.8 power supplies and support pins

2.3.8.1 power support

table 14. power support

symbol | lead # | type / voltage | name and function
ctrl10 | 62 | a-out | 1.05 v dc control. voltage control for an external 1.05 v dc pnp.
ctrl19 | 64 | a-out | 1.9 v dc control. voltage control for an external 1.9 v dc pnp.
dis_reg10 | 59 | a-in | disable 1.05 v dc regulator. when high, the internal 1.05 v dc regulator is disabled and the ctrl10 signal is active. when low, the internal 1.05 v dc regulator is enabled using its internal power transistor. in this case, the ctrl10 signal is inactive.

2.3.8.2 power supply

table 15. power supply

symbol | lead # | type / voltage | name and function
vdd1p0 | 4, 11, 18, 27, 37, 41, 60 | 1.05 v dc | 1.05 v dc power supply (7).
avdd1p9 | 22, 44, 47, 51, 56, 61, 63 | 1.9 v dc | 1.9 v dc power supply (7).
vdd3p3 | 10, 32 | 3.3 v dc | 3.3 v dc power supply (2).
avdd3p3/vdd3p3 | 1 | 3.3 v dc | 3.3 v dc power supply (1).
vdd1p9 | 19 | 1.9 v dc | fuse voltage for programming on-die fuses. connect to 1.9 v dc for normal operation.
gnd | e-pad | ground | the e-pad metal connection on the bottom of the package. should be connected to ground.
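because the lead assignments are spread over tables 7 through 15, a short consistency check can confirm that every one of the 64 leads is used exactly once before the numbers are carried into a schematic symbol. this is a sketch under stated assumptions: the lead numbers are transcribed from the tables above, while the names `PIN_TABLES` and `build_lead_map` are hypothetical and exist only for this example. the e-pad (gnd) is not a numbered lead and is left out.

```python
from typing import Dict, Tuple

# Lead numbers transcribed from tables 7 through 15 (symbol -> leads).
PIN_TABLES: Dict[str, Dict[str, Tuple[int, ...]]] = {
    "pcie": {
        "peclkp": (26,), "peclkn": (25,), "pe_tp": (21,), "pe_tn": (20,),
        "pe_rp": (24,), "pe_rn": (23,), "pe_wake_n/jtag_tdo": (16,),
        "pe_rst_n": (17,),
    },
    "nvm port": {
        "nvm_si": (12,), "nvm_sk": (13,), "nvm_so": (14,), "nvm_cs_n": (15,),
    },
    "smbus": {"smb_clk": (34,), "smb_alrt_n": (35,), "smb_dat": (36,)},
    "nc-si and testability": {
        "nc_si_clk_in": (2,), "nc_si_crs_dv": (3,), "nc_si_rxd1": (5,),
        "nc_si_rxd0": (6,), "nc_si_tx_en": (7,), "nc_si_txd1": (8,),
        "nc_si_txd0": (9,), "test_en": (29,), "nvmt/jtag_tms": (38,),
        "aux_pwr/jtag_tck": (39,), "jtag_tdi": (40,),
    },
    "leds": {"led1": (30,), "led0": (31,), "led2": (33,)},
    "phy": {
        "mdi_plus[0]": (58,), "mdi_minus[0]": (57,), "mdi_plus[1]": (55,),
        "mdi_minus[1]": (54,), "mdi_plus[2]": (53,), "mdi_minus[2]": (52,),
        "mdi_plus[3]": (50,), "mdi_minus[3]": (49,), "xtal1": (43,),
        "xtal2": (42,), "atest_p": (45,), "atest_n": (46,), "rset": (48,),
    },
    "miscellaneous": {"dev_off_n": (28,)},
    "power support": {"dis_reg10": (59,), "ctrl10": (62,), "ctrl19": (64,)},
    "power supply": {
        "vdd1p0": (4, 11, 18, 27, 37, 41, 60),
        "avdd1p9": (22, 44, 47, 51, 56, 61, 63),
        "vdd3p3": (10, 32), "avdd3p3/vdd3p3": (1,), "vdd1p9": (19,),
    },
}


def build_lead_map(tables=PIN_TABLES, lead_count: int = 64) -> Dict[int, str]:
    """Map lead number -> symbol; raise if a lead is reused, unused or out of range."""
    lead_map: Dict[int, str] = {}
    for symbols in tables.values():
        for symbol, leads in symbols.items():
            for lead in leads:
                if not 1 <= lead <= lead_count:
                    raise ValueError(f"{symbol}: lead {lead} is out of range")
                if lead in lead_map:
                    raise ValueError(
                        f"lead {lead} assigned to {lead_map[lead]} and {symbol}")
                lead_map[lead] = symbol
    missing = sorted(set(range(1, lead_count + 1)) - set(lead_map))
    if missing:
        raise ValueError(f"unassigned leads: {missing}")
    return lead_map


if __name__ == "__main__":
    for lead, symbol in sorted(build_lead_map().items()):
        print(f"{lead:2d}  {symbol}")
```

the check passes for the tables as transcribed here; a failure after editing the dictionary would point to a transcription error rather than to a device problem.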
2.4 package

the 82574 supports a 64-pin, 9 x 9 qfn package with e-pad. figure 3 shows the package schematics.

[figure 3. 82574 qfn 9 x 9 mm package]
3.0 interconnects

3.1 pcie

pcie is a third generation i/o architecture that enables cost competitive, next generation i/o solutions providing industry leading price/performance and feature richness. it is an industry-driven specification.

pcie defines a basic set of requirements that comprehends the majority of the targeted application classes. high-end application requirements such as enterprise class servers and high-end communication platforms are delivered by a set of advanced extensions that complement the baseline requirements.

to guarantee headroom for future applications of pcie, a software-managed mechanism for introducing new, enhanced capabilities in the platform is provided. figure 4 shows the pcie architecture.

[figure 4. pcie stack structure. layers: config/os, s/w, protocol, link, physical (electrical, mechanical). annotations: pci.sys compliant; preserve driver model; common base protocol; advanced xtensions; configurable widths 1 .. 32; 2.5+ gb/s; point to point, serial, differential, hot-plug, inter-op form factors.]

the pcie physical layer consists of a differential transmit pair and a differential receive pair. full-duplex data on these two point-to-point connections is self-clocked such that no dedicated clock signals are required.

note: the bandwidth of this interface increases linearly with frequency.
A packet is the fundamental unit of information exchange, and the protocol includes a message space to replace the side-band signals found on many of today's buses. This movement of hard-wired signals from the physical layer to messages within the transaction layer enables easy and linear physical layer width expansion for increased bandwidth.

The common base protocol uses split transactions, along with several mechanisms that eliminate wait states and optimize the reordering of transactions, to further improve system performance.

3.1.1 Architecture, Transaction, and Link Layer Properties

• Split transaction, packet-based protocol
• Common flat address space for load/store access (such as a PCI addressing model):
  - Memory address space of 32 bits to enable a compact packet header (must be used to access addresses below 4 GB)
  - Memory address space of 64 bits using an extended packet header
• Transaction layer mechanisms:
  - PCI-X style relaxed ordering
  - Optimizations for no-snoop transactions
• Credit-based flow control
• Packet sizes/formats:
  - Maximum packet size supports 128- and 256-byte data payload
  - Maximum read request size of 4 KB
• Reset/initialization:
  - Frequency/width/profile negotiation performed by hardware
• Data integrity support:
  - Using CRC-32 for transaction layer packets
• Link layer retry for recovery following error detection:
  - Using CRC-16 for link layer messages
• No retry following error detection:
  - 8b/10b encoding with running disparity
• Software configuration mechanism:
  - Uses PCI configuration and bus enumeration model
  - PCIe-specific configuration registers mapped via the PCI extended capability mechanism
• Baseline messaging:
  - In-band messaging of formerly side-band legacy signals (such as interrupts)
  - System-level power management supported via messages
• Power management (PM):
  - Full PCI PM support
  - Wake capability from the D3cold state
  - Compliant with ACPI 2.0, PCI PM software model
  - Active state power management (transparent to software, including ACPI)
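The 32-bit versus 64-bit addressing rule in the list above can be illustrated with a minimal sketch. The function name `header_dwords` is hypothetical; the 3-dword and 4-dword header lengths come from the PCIe base specification rather than from this datasheet.

```python
FOUR_GB = 1 << 32


def header_dwords(address: int) -> int:
    """Return the memory-request header length in dwords for a target address."""
    if address < 0 or address >= (1 << 64):
        raise ValueError("address must fit in 64 bits")
    # Below 4 GB the compact 32-bit-address header must be used (3 dwords);
    # at or above 4 GB the extended 64-bit-address header is needed (4 dwords).
    return 3 if address < FOUR_GB else 4
```

A request to address 0xFFFF_FFFF would use the compact header, while a request to 0x1_0000_0000 would need the extended one.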
3.1.1.1 Physical Interface Properties

• Point-to-point interconnect
  - Full-duplex; no arbitration
• Signaling technology:
  - Low voltage differential
  - Embedded clock signaling using the 8b/10b encoding scheme
• Serial frequency of operation: 2.5 GHz
• Interface width of one lane per direction
• DFT and DFM support for high-volume manufacturing

3.1.1.2 Advanced Extensions

PCIe defines a set of optional features to enhance platform capabilities for specific usage modes. The 82574 supports the following optional features:

• Extended error reporting - messaging support to communicate multiple types/severities of errors
• Serial number

3.1.2 General Functionality

• Native/legacy:
  - The PCIe capability register states the device/port type.
  - The 82574 is a native device by default.
• Locked transactions:
  - The 82574 does not support locked requests as a target or master.
• End-to-end CRC (ECRC):
  - Not supported by the 82574

3.1.3 Transaction Layer

The upper layer of the PCIe architecture is the transaction layer. The transaction layer connects to the 82574's core using an implementation-specific protocol. Through this core-to-transaction-layer protocol, the application-specific parts of the 82574 interact with the PCIe subsystem and transmit and receive requests to or from the remote PCIe agent, respectively.

3.1.3.1 Transaction Types Received by the Transaction Layer

Table 16. Transaction types at the Rx transaction layer

Transaction type | FC type | Tx layer reaction | Hardware should keep data from original packet | For client
Configuration read request | NPH | CPLH + CPLD | Requester ID, tag, attribute | Configuration space
Configuration write request | NPH + NPD | CPLH | Requester ID, tag, attribute | Configuration space
Memory read request | NPH | CPLH + CPLD | Requester ID, tag, attribute | CSR
Memory write request | PH + PD | - | - | CSR
I/O read request | NPH | CPLH + CPLD | Requester ID, tag, attribute | CSR
I/O write request | NPH + NPD | CPLH | Requester ID, tag, attribute | CSR
Read completions | CPLH + CPLD | - | - | DMA
Message | PH | - | - | Message unit / INT / PM / error unit

Flow control types:

• PH - posted request headers
• PD - posted request data payload
• NPH - non-posted request headers
• NPD - non-posted request data payload
• CPLH - completion headers
• CPLD - completion data payload

3.1.3.2 Transaction Types Initiated by the 82574

Table 17. Transaction types at the Tx transaction layer

Transaction type | Payload size | FC type | From client
Configuration read request completion | Dword | CPLH + CPLD | Configuration space
Configuration write request completion | - | CPLH | Configuration space
I/O read request completion | Dword | CPLH + CPLD | CSR
I/O write request completion | - | CPLH | CSR
Read request completion | Dword/qword | CPLH + CPLD | CSR
Memory read request | - | NPH | DMA
Memory write request | ≤ MAX_PAYLOAD_SIZE (1) | PH + PD | DMA
Message | - | PH | Message unit / INT / PM / error unit

1. The MAX_PAYLOAD_SIZE supported is loaded from the NVM (either 128 bytes or 256 bytes). The effective MAX_PAYLOAD_SIZE is set according to the configuration space register.

3.1.3.3 Message Handling by the 82574 (as a Receiver)

Message packets are special packets that carry a message code. The upstream device transmits special messages to the 82574 by using this mechanism. The transaction layer decodes the message code and responds to the message accordingly.
Table 18. Supported messages in the 82574 (as a receiver)

Message code [7:0] | Routing r2r1r0 | Message | Device's later response
0x14 | 100 | PM_Active_State_NAK | Internal signal set
0x19 | 011 | PME_Turn_Off | Internal signal set
0x41 | 100 | Attention_Indicator_On | Silently drop
0x43 | 100 | Attention_Indicator_Blink | Silently drop
0x40 | 100 | Attention_Indicator_Off | Silently drop
0x45 | 100 | Power_Indicator_On | Silently drop
0x47 | 100 | Power_Indicator_Blink | Silently drop
0x44 | 100 | Power_Indicator_Off | Silently drop
0x50 | 100 | Slot power limit support (has one dword of data) | Silently drop
0x7E | 010, 011, 100 | Vendor_Defined type 0, no data | Unsupported request - NEC*
0x7E | 010, 011, 100 | Vendor_Defined type 0, data | Unsupported request - NEC*
0x7F | 010, 011, 100 | Vendor_Defined type 1, no data | Silently drop
0x7F | 010, 011, 100 | Vendor_Defined type 1, data | Silently drop
0x00 | 011 | Unlock | Silently drop

3.1.3.4 Message Handling by the 82574 (as a Transmitter)

The transaction layer is also responsible for transmitting specific messages to report internal/external events (such as interrupts and PMEs).

Table 19. Supported messages in the 82574 (as a transmitter)

Message code [7:0] | Routing r2r1r0 | Message
0x20 | 100 | Assert INT A
0x21 | 100 | Assert INT B
0x22 | 100 | Assert INT C
0x23 | 100 | Assert INT D
0x24 | 100 | De-assert INT A
0x25 | 100 | De-assert INT B
0x26 | 100 | De-assert INT C
0x27 | 100 | De-assert INT D
0x30 | 000 | ERR_COR
0x31 | 000 | ERR_NONFATAL
0x33 | 000 | ERR_FATAL
0x18 | 000 | PM_PME
0x1B | 101 | PME_TO_Ack
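As an illustration of how a receiver model might apply Table 18, the following sketch maps a received message code to the listed response. The names `RECEIVED_MESSAGES` and `respond_to_message` are hypothetical, and treating an unlisted code as an unsupported request is an assumption drawn from the "not valid msg code" entry in the error events table, not from Table 18 itself.

```python
# Message code -> (message name, response), transcribed from Table 18.
RECEIVED_MESSAGES = {
    0x14: ("PM_Active_State_NAK", "internal signal set"),
    0x19: ("PME_Turn_Off", "internal signal set"),
    0x41: ("Attention_Indicator_On", "silently drop"),
    0x43: ("Attention_Indicator_Blink", "silently drop"),
    0x40: ("Attention_Indicator_Off", "silently drop"),
    0x45: ("Power_Indicator_On", "silently drop"),
    0x47: ("Power_Indicator_Blink", "silently drop"),
    0x44: ("Power_Indicator_Off", "silently drop"),
    0x50: ("Slot power limit support", "silently drop"),
    0x7E: ("Vendor_Defined type 0", "unsupported request"),
    0x7F: ("Vendor_Defined type 1", "silently drop"),
    0x00: ("Unlock", "silently drop"),
}


def respond_to_message(code: int) -> str:
    """Return the response listed for a received message code."""
    # Assumption: codes missing from the table are handled as unsupported requests.
    _name, response = RECEIVED_MESSAGES.get(code, ("unknown", "unsupported request"))
    return response
```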
3.1.3.5 Data Alignment

4 KB boundary:

Requests must never specify an address/length combination that causes a memory space access to cross a 4 KB boundary. It is hardware's responsibility to break requests into 4 KB-aligned requests (if needed). This does not impose any requirement on software. However, if software allocates a buffer across a 4 KB boundary, hardware issues multiple requests for the buffer. Software should consider aligning buffers to a 4 KB boundary in cases where it improves performance.

The alignment to the 4 KB boundaries is done in the core. The transaction layer does not do any alignment according to these boundaries.

64 bytes:

It is also recommended that requests be multiples of 64 bytes and aligned, to make better use of memory controller resources. This is also done in the core.

3.1.3.6 Configuration Request Retry Status

The 82574 might have a delay in initialization due to an NVM read. PCIe defines a mechanism for devices that require completion of a lengthy self-initialization sequence before being able to service configuration requests.

If the read of the PCIe section in the NVM has not completed before the 82574 receives a configuration request, the 82574 responds with a configuration request retry completion status to terminate the request. This effectively stalls the configuration request until the subsystem has completed local initialization and is ready to communicate with the host.

3.1.3.7 Ordering Rules

The 82574 meets the PCIe ordering rules (PCI-X rules) by following the PCI simple device model:

• Deadlock avoidance - Master and target accesses are independent. The response to a target access does not depend on the status of a master request to the bus. If master requests are blocked (for example, due to no credits), target completions can still proceed (if credits are available).
• Descriptor/data ordering - The 82574 does not proceed with some internal actions until the respective data writes have ended on the PCIe link:
  - The 82574 does not update an internal header pointer until the descriptors that the header pointer relates to are written to the PCIe link.
  - The 82574 does not issue a descriptor write until the data that the descriptor relates to is written to the PCIe link.

The 82574 can issue the following master read requests from each of the following clients:

• Rx descriptor read (one per queue)
• Tx descriptor read (one per queue)
• Tx data read (up to four, including one for manageability)

Completions of separate read requests are not guaranteed to return in order. Completions for a single read request are guaranteed to return in address order.
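The 4 KB boundary rule from section 3.1.3.5 can be sketched as follows. This is only an illustration of the splitting arithmetic; the function name `split_at_4kb` is hypothetical, and real hardware would additionally limit each piece by the maximum payload or read request size.

```python
PAGE = 0x1000  # 4 KB


def split_at_4kb(address: int, length: int) -> list[tuple[int, int]]:
    """Split (address, length) into pieces that never cross a 4 KB boundary."""
    if length <= 0:
        raise ValueError("length must be positive")
    pieces = []
    while length > 0:
        room = PAGE - (address & (PAGE - 1))  # bytes left before the next boundary
        size = min(room, length)
        pieces.append((address, size))
        address += size
        length -= size
    return pieces
```

A 0x300-byte buffer starting at 0x0F00 becomes two requests, (0x0F00, 0x100) and (0x1000, 0x200), which is why a 4 KB-aligned buffer can save a request.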
3.1.3.8 Transaction Attributes

3.1.3.8.1 Traffic Class (TC) and Virtual Channels (VC)

The 82574 supports only TC = 0 and VC = 0 (default).

3.1.3.8.2 Relaxed Ordering

The 82574 takes advantage of the relaxed ordering rules in PCIe by setting the relaxed ordering bit in the packet header. The 82574 also enables the system to optimize performance in the following cases:

• Relaxed ordering for descriptor and data reads: When the 82574 is a master in a read transaction, its split completion has no relationship with the writes from the CPUs (same direction). It should be allowed to bypass the writes from the CPUs.
• Relaxed ordering for receiving data writes: When the 82574 masters receive data writes, it also enables them to bypass each other in the path to system memory, because software does not process this data until the associated descriptor writes have been completed.
• The 82574 cannot perform relaxed ordering for descriptor writes or an MSI write.

Relaxed ordering can be used in conjunction with the no-snoop attribute to enable the memory controller to advance non-snoop writes ahead of earlier snooped writes.

Relaxed ordering is enabled in the 82574 by setting the RO_DIS bit to 0b in the CTRL_EXT register.

3.1.3.8.3 Snoop Not Required

The 82574 sets the snoop not required attribute bit for master data writes. System logic can provide a separate path into system memory for non-coherent traffic. The non-coherent path to system memory provides higher, more uniform bandwidth for write requests.

The snoop not required attribute bit does not alter transaction ordering. Therefore, to achieve maximum benefit from snoop not required transactions, it is advisable to set the relaxed ordering attribute as well (assuming that system logic supports both attributes).

Software configures no-snoop support through the 82574's control register and a set of no-snoop bits in the GCR register in the CSR space. The default value for all bits is disabled.

The 82574 supports a no-snoop bit for each relevant DMA client:

1. TXDSCR_NOSNOOP - transmit descriptor read.
2. TXDSCW_NOSNOOP - transmit descriptor write.
3. TXD_NOSNOOP - transmit data read.
4. RXDSCR_NOSNOOP - receive descriptor read.
5. RXDSCW_NOSNOOP - receive descriptor write.
6. RXD_NOSNOOP - receive data write.

All PCIe functions in the 82574 are controlled by this register.
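A driver would typically combine the per-client no-snoop bits into one GCR value. The sketch below shows only the mask arithmetic. The bit positions in `NOSNOOP_BITS` are assumptions for illustration (this section does not give them; consult the GCR register description), and `nosnoop_mask` is a hypothetical helper.

```python
# ASSUMED bit positions, for illustration only; not stated in this section.
NOSNOOP_BITS = {
    "RXD_NOSNOOP": 0,
    "RXDSCW_NOSNOOP": 1,
    "RXDSCR_NOSNOOP": 2,
    "TXD_NOSNOOP": 3,
    "TXDSCW_NOSNOOP": 4,
    "TXDSCR_NOSNOOP": 5,
}


def nosnoop_mask(clients) -> int:
    """Build a register mask with the no-snoop bit set for each named DMA client."""
    mask = 0
    for name in clients:
        mask |= 1 << NOSNOOP_BITS[name]  # KeyError on an unknown client name
    return mask
```

Because all bits default to disabled, an empty client list yields a mask of 0.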
3.1.3.9 Error Forwarding

If a transaction layer packet (TLP) is received with an error-forwarding trailer, the packet is dropped and not delivered to its destination. The 82574 does not initiate any additional master requests for that PCI function until it detects an internal reset or a software reset. Software is able to access device registers after such a fault.

System logic is expected to trigger a system-level interrupt to inform the operating system of the problem. The operating system can then stop the process associated with the transaction, re-allocate memory in place of the faulty area, and so on.

3.1.3.10 Master Disable

System software can disable master accesses on the PCIe link either by clearing the PCI bus master bit or by bringing the function into a D3 state. From that time on, the 82574 must not issue master accesses for this function. Due to the full-duplex nature of PCIe and the pipelined design of the 82574, multiple requests from several functions might be pending when the master disable request arrives. The protocol described in this section ensures that a function does not issue master requests to the PCIe link after its master enable bit is cleared (or after entry to the D3 state).

Two configuration bits are provided for the handshake between the device function and its driver:

• PCIe master disable bit in the device control (CTRL) register - When the PCIe master disable bit is set, the 82574 blocks new master requests, including manageability requests. The 82574 then proceeds to issue any pending requests by this function. This bit is cleared on master reset (internal power-on reset all the way to a software reset) to enable master accesses.
• PCIe master enable status bit in the device status register - Cleared by the 82574 when the PCIe master disable bit is set and no master requests are pending by the relevant function; set otherwise.

Software note:

• The software device driver sets the PCIe master disable bit when notified of a pending master disable (or D3 entry). The 82574 then blocks new requests and proceeds to issue any pending requests by this function. The software device driver then polls the PCIe master enable status bit. Once the bit is cleared, it is guaranteed that no requests are pending from this function. The software device driver might time out if the PCIe master enable status bit is not cleared within a given time.
• The PCIe master disable bit must be cleared to enable a master request to the PCIe link. This can be done either through reset or by the software device driver.

3.1.4 Flow Control

3.1.4.1 Flow Control Rules

The 82574 only implements the default virtual channel (VC0). A single set of credits is maintained for VC0.
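The driver-side handshake described in the master disable software note (section 3.1.3.10) can be sketched as a set-then-poll loop. Everything here is a model, not driver code: `FakeDevice` stands in for memory-mapped register access, the bit positions `CTRL_MASTER_DISABLE` and `STATUS_MASTER_ENABLE` are assumed for illustration, and the poll limit is arbitrary.

```python
CTRL_MASTER_DISABLE = 1 << 2    # ASSUMED bit position in CTRL
STATUS_MASTER_ENABLE = 1 << 19  # ASSUMED bit position in STATUS


class FakeDevice:
    """Toy device: pending requests drain after a fixed number of status polls."""

    def __init__(self, polls_until_idle: int):
        self.ctrl = 0
        self.status = STATUS_MASTER_ENABLE
        self._remaining = polls_until_idle

    def read_status(self) -> int:
        if self.ctrl & CTRL_MASTER_DISABLE:
            if self._remaining == 0:
                self.status &= ~STATUS_MASTER_ENABLE  # no requests pending
            else:
                self._remaining -= 1
        return self.status


def disable_master(dev, max_polls: int = 800) -> bool:
    """Set master disable, then poll; return False if the driver times out."""
    dev.ctrl |= CTRL_MASTER_DISABLE
    for _ in range(max_polls):
        if not dev.read_status() & STATUS_MASTER_ENABLE:
            return True
    return False
```

A real driver would insert a delay between polls and decide how to recover when `disable_master` returns False.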
Table 20. Allocation of FC credits

Credit type | Operations | Number of credits
Posted request header (PH) | Target write (1 unit); message (1 unit) | 2 units
Posted request data (PD) | Target write (length/16 bytes = 1); message (1 unit) | 16 credits (for 256 bytes)
Non-posted request header (NPH) | Target read (1 unit); configuration read (1 unit); configuration write (1 unit) | 2 units
Non-posted request data (NPD) | Configuration write (1 unit) | 2 units
Completion header (CPLH) | Read completion (N/A) | Infinite (accepted immediately)
Completion data (CPLD) | Read completion (N/A) | Infinite (accepted immediately)

Rules for FC updates:

• The 82574 maintains two credits for NPD at any given time. It increments the credit by one after the credit is consumed and sends an UpdateFC packet as soon as possible. UpdateFC packets are scheduled immediately after a resource becomes available.
• The 82574 provides two credits for PH (such as for two concurrent target writes) and two credits for NPH (such as for two concurrent target reads). UpdateFC packets are scheduled immediately after a resource becomes available.
• The 82574 follows the PCIe recommendations for the frequency of UpdateFC FCPs.

3.1.4.2 Upstream Flow Control Tracking

The 82574 issues a master transaction only when the required FC credits are available. Credits are tracked for posted, non-posted, and completions (the latter to operate against a switch).

3.1.4.3 Flow Control Update Frequency

In any case, UpdateFC packets are scheduled immediately after a resource becomes available.

When the link is in the L0 or L0s link state, UpdateFC FCPs for each enabled type of non-infinite FC credit must be scheduled for transmission at least once every 30 µs (-0%/+50%), except when the extended sync bit of the link control register is set, in which case the limit is 120 µs (-0%/+50%).

3.1.4.4 Flow Control Timeout Mechanism

The 82574 implements the optional FC update timeout mechanism. The mechanism is activated when the link is in the L0 or L0s link state. It uses a timer with a limit of 200 µs (-0%/+50%), where the timer is reset by the receipt of any Init or Update FCP. Alternatively, the timer can be reset by the receipt of any DLLP. After timer expiration, the mechanism instructs the PHY to retrain the link (via the LTSSM recovery state).
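The credit arithmetic implied by Table 20 and the update timing in section 3.1.4.3 can be worked through with a short sketch. It assumes one PD credit per 16 bytes of payload, rounded up, which matches the "16 credits (for 256 bytes)" entry; the function names are hypothetical.

```python
def pd_credits(payload_bytes: int) -> int:
    """Posted-data credits consumed by a write, at one credit per 16 bytes."""
    if payload_bytes < 0:
        raise ValueError("payload size cannot be negative")
    return -(-payload_bytes // 16)  # ceiling division


def update_fc_window_us(extended_sync: bool = False) -> tuple[float, float]:
    """Allowed range for the UpdateFC interval, given the -0%/+50% tolerance."""
    nominal = 120.0 if extended_sync else 30.0
    return (nominal, nominal * 1.5)
```

A 256-byte target write consumes all 16 PD credits, so a second maximum-size write has to wait for an UpdateFC; the default update interval limit falls between 30 µs and 45 µs.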
3.1.5 Host I/F

3.1.5.1 Tag IDs

PCIe device numbers identify logical devices within the physical device (the 82574 is a physical device). The 82574 implements a single logical device with one PCI function: LAN. The device number is captured from each type 0 configuration write transaction.

Each of the PCIe functions interfaces with the PCIe unit through one or more clients. A client ID identifies the client and is included in the tag field of the PCIe packet header. Completions always carry the tag value included in the request to enable routing of the completion to the appropriate client.

Client IDs are assigned as follows:

Table 21. Assignment of client IDs

Tag code (hex) | Flow: TLP type - usage
00 | Rx: write request (data from Ethernet to main memory)
01 | Rx: read request to read descriptor to core
02 | Rx: write request to write back descriptor from core to memory
04 | Tx: read request to read descriptor to core
05 | Tx: write request to write back descriptor from core to memory
06 | Tx: read request to read descriptor to core (second queue)
07 | Tx: write request to write back descriptor from core to memory (second queue)
08 | Tx: read request data 0 from main memory to Ethernet
09 | Tx: read request data 1 from main memory to Ethernet
0A | Tx: read request data 2 from main memory to Ethernet
0B | Tx: read request data 3 from main memory to Ethernet
0C | Rx: read request to bring descriptor to core (second queue)
0E | Rx: write request to write back descriptor from core to memory (second queue)
10 | MNG: read request: read data
11 | MNG: write request: write data
1E | MSI and MSI-X
1F | Message unit
Others | Reserved
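When reading a bus trace, Table 21 can be used to attribute each TLP to its internal client. The lookup below is a minimal sketch of that decoding; `CLIENT_TAGS` and `decode_tag` are hypothetical names, and the descriptions are shortened from the table.

```python
# Tag code -> client, transcribed from Table 21.
CLIENT_TAGS = {
    0x00: "Rx data write",
    0x01: "Rx descriptor read",
    0x02: "Rx descriptor write-back",
    0x04: "Tx descriptor read",
    0x05: "Tx descriptor write-back",
    0x06: "Tx descriptor read (second queue)",
    0x07: "Tx descriptor write-back (second queue)",
    0x08: "Tx data read 0",
    0x09: "Tx data read 1",
    0x0A: "Tx data read 2",
    0x0B: "Tx data read 3",
    0x0C: "Rx descriptor read (second queue)",
    0x0E: "Rx descriptor write-back (second queue)",
    0x10: "MNG read data",
    0x11: "MNG write data",
    0x1E: "MSI and MSI-X",
    0x1F: "Message unit",
}


def decode_tag(tag: int) -> str:
    """Name the client that owns a tag value; unlisted values are reserved."""
    return CLIENT_TAGS.get(tag, "reserved")
```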
3.1.5.1.1 Completion Timeout Mechanism

In any split transaction protocol, there is a risk associated with the failure of a requester to receive an expected completion. To enable requesters to attempt recovery from this situation in a standard manner, the completion timeout mechanism is defined.

• The completion timeout mechanism is activated for each request that requires one or more completions when the request is transmitted.
• The completion timeout timer should not expire in less than 10 ms.
• The completion timeout timer must expire if a request is not completed in 50 ms.
• A completion timeout is a reported error associated with the requester device/function.

A memory read request for which there are multiple completions is considered completed only when all completions are received by the requester. If some, but not all, requested data is returned before the completion timeout timer expires, the requester is permitted to keep or discard the data that was returned prior to timer expiration.

3.1.5.1.2 Out of Order Completion Handling

In a split transaction protocol, when using multiple read requests in a multiprocessor environment, there is a risk that the completions might arrive from host memory out of order and interleaved. In this case, the role of the host interface is to sort the request completions and transfer them to the Ethernet core in the correct order.

3.1.6 Error Events and Error Reporting

3.1.6.1 Mechanism in General

PCIe defines two error reporting paradigms: the baseline capability and the advanced error reporting (AER) capability. The baseline error reporting capabilities are required of all PCIe devices and define the minimum error reporting requirements. The AER capability is defined for more robust error reporting and is implemented with a specific PCIe capability structure. Both mechanisms are supported by the 82574.

The SERR# enable and the parity error bits from the legacy command register also take part in the error reporting and logging mechanism.

Figure 5 shows, in detail, the flow of error reporting in the 82574.
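The sorting role described in section 3.1.5.1.2 (out of order completion handling) can be modeled with a small reorder buffer. This is a simplified sketch under stated assumptions: each request is treated as finishing with a single completion, tags identify requests, and the class name `CompletionReorderBuffer` is hypothetical rather than a description of the actual host interface logic.

```python
from collections import OrderedDict


class CompletionReorderBuffer:
    """Release read completions to the core in the order the requests were issued."""

    def __init__(self):
        self._pending = OrderedDict()  # tag -> data (None until completed)

    def issue(self, tag: int) -> None:
        self._pending[tag] = None

    def complete(self, tag: int, data: bytes) -> list:
        """Record a completion and return the (tag, data) pairs now deliverable."""
        if tag not in self._pending:
            raise KeyError("unexpected completion")
        self._pending[tag] = data
        ready = []
        while self._pending:
            oldest = next(iter(self._pending))
            if self._pending[oldest] is None:
                break  # the oldest request is still outstanding
            ready.append((oldest, self._pending.pop(oldest)))
        return ready
```

If requests 8 and 9 are issued in that order and the completion for 9 arrives first, nothing is delivered until the completion for 8 arrives, at which point both are released in request order.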
Figure 5. Error reporting flow
[Figure labels: error sources (associated with port); secondary side error sources; error message processing; Command (SERR# enable, parity error response); Status (signaled target abort, received target abort, received master abort, detected parity error, master data parity error, signaled system error); Device Control (correctable, non-fatal, fatal, and unsupported request reporting enables); Device Status (correctable, non-fatal, fatal error detected; unsupported request detected); uncorrectable error severity, mask, and status; correctable error mask and status; Secondary Status; Bridge Control; Root Control; Root Error Command; Root Error Status; system error; interrupt]

3.1.6.1.1 Error Events

Table 22 lists the error events identified by the 82574 and the response in terms of logging, reporting, and actions taken. Consult the PCIe specification for the effect on the PCI status register.

Table 22. Response and reporting of error events

Error name | Error events | Default severity | Action

PHY errors
Receiver error | 8b/10b decode errors; packet framing error | Correctable; send ERR_COR | TLP to initiate NAK, drop data; DLLP to drop data

Link errors
Bad TLP | Bad CRC; not legal EDB; wrong sequence number | Correctable; send ERR_COR | TLP to initiate NAK, drop data
Bad DLLP | Bad CRC | Correctable; send ERR_COR | DLLP to drop
Replay timer timeout | REPLAY_TIMER expiration | Correctable; send ERR_COR | Follow LL rules
REPLAY NUM rollover | REPLAY NUM rollover | Correctable; send ERR_COR | Follow LL rules
Data link layer protocol error | Violations of flow control initialization protocol | Uncorrectable; send ERR_FATAL | -

TLP errors
Poisoned TLP received | TLP with error forwarding | Uncorrectable; ERR_NONFATAL; log header | In case of a poisoned completion, no more requests from this client
Unsupported request (UR) | Wrong config access; MRdLk; config request type 1; unsupported vendor-defined type 0 message; not valid MSG code; not supported TLP type; wrong function number; wrong TC/VC; received target access with data size > 64 bits; received TLP outside address range | Uncorrectable; ERR_NONFATAL; log header | Send completion with UR
Completion timeout | Completion timeout timer expired | Uncorrectable; ERR_NONFATAL | Send the read request again
Completer abort | Attempts to write to the flash device when writes are disabled (FWE=10b) | Uncorrectable; ERR_NONFATAL; log header | Send completion with CA
Unexpected completion | Received completion without a request for it (tag, ID, etc.) | Uncorrectable; ERR_NONFATAL; log header | Discard TLP
Receiver overflow | Received TLP beyond allocated credits | Uncorrectable; ERR_FATAL | Receiver behavior is undefined
Flow control protocol error | Minimum initial flow control advertisements; flow control update for infinite credit advertisement | Uncorrectable; ERR_FATAL | Receiver behavior is undefined
Malformed TLP (MP) | Data payload exceeds MAX_PAYLOAD_SIZE; received TLP data size does not match length field; TD field value does not correspond with the observed size; byte enable violations; PM messages that don't use TC0; usage of unsupported VC | Uncorrectable; ERR_FATAL; log header | Drop the packet, free FC credits
Completion with unsuccessful completion status | - | No action (already done by originator of completion) | Free FC credits
3.1.6.1.2 Error Pollution

Error pollution can occur if error conditions for a given transaction are not isolated to the error's first occurrence. If the PHY detects and reports a receiver error, then, to avoid having this error propagate and cause subsequent errors at upper layers, the same packet is not signaled at the data link or transaction layers. Similarly, when the data link layer detects an error, subsequent errors that occur for the same packet are not signaled at the transaction layer.

3.1.6.1.3 Completion with Unsuccessful Completion Status

A completion with unsuccessful completion status is dropped and not delivered to its destination. The request that corresponds to the unsuccessful completion is retried by sending a new request for the undeliverable data.

3.1.7 Link Layer

3.1.7.1 ACK/NAK Scheme

The 82574 supports two alternative schemes for the ACK/NAK rate:

1. ACK/NAK is scheduled for transmission following any TLP.
2. ACK/NAK is scheduled for transmission according to the timeouts specified in the PCIe specification.

The PCIe error recovery bit, loaded from the NVM, determines which of the two schemes is used.

3.1.7.2 Supported DLLPs

The following DLLPs are supported by the 82574 as a receiver:

Table 23. DLLPs received by the 82574

DLLP | Remarks
Ack |
Nak |
PM_Request_Ack |
InitFC1-P | v2v1v0 = 000
InitFC1-NP | v2v1v0 = 000
InitFC1-Cpl | v2v1v0 = 000
InitFC2-P | v2v1v0 = 000
InitFC2-NP | v2v1v0 = 000
InitFC2-Cpl | v2v1v0 = 000
UpdateFC-P | v2v1v0 = 000
UpdateFC-NP | v2v1v0 = 000
UpdateFC-Cpl | v2v1v0 = 000

The following DLLPs are supported by the 82574 as a transmitter:
Table 24. DLLPs initiated by the 82574

DLLP | Remarks (1)
Ack |
Nak |
PM_Enter_L1 |
PM_Enter_L23 |
PM_Active_State_Request_L1 |
InitFC1-P | v2v1v0 = 000
InitFC1-NP | v2v1v0 = 000
InitFC1-Cpl | v2v1v0 = 000
InitFC2-P | v2v1v0 = 000
InitFC2-NP | v2v1v0 = 000
InitFC2-Cpl | v2v1v0 = 000
UpdateFC-P | v2v1v0 = 000
UpdateFC-NP | v2v1v0 = 000

1. UpdateFC-Cpl is not sent because of the infinite FC-Cpl allocation.

3.1.7.3 Transmit EDB Nullifying

When a retrain is needed, the 82574 must guarantee that the packet being transmitted is not terminated abruptly. For this reason, early termination of the transmitted packet is possible. This is done by appending the EDB to the packet.

3.1.8 PHY

3.1.8.1 Link Width

The 82574 supports a link width of x1 only.

3.1.8.2 Polarity Inversion

If polarity inversion is detected, the receiver must invert the received data.

During the training sequence, the receiver looks at symbols 6-15 of TS1 and TS2 as the indicator of lane polarity inversion (D+ and D- are swapped). If lane polarity inversion occurs, the TS1 symbols 6-15 received are D21.5 as opposed to the expected D10.2. Similarly, if lane polarity inversion occurs, symbols 6-15 of the TS2 ordered set are D26.5 as opposed to the expected D5.2. This provides a clear indication of lane polarity inversion.

3.1.8.3 L0s Exit Latency

The number of FTS sequences (N_FTS), sent during L0s exit, is loaded from the NVM into an 8-bit read-only register.
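The polarity check in section 3.1.8.2 can be made concrete by converting the Dx.y names to byte values. In 8b/10b naming, x is the low five bits and y is the high three bits of the data byte; for these particular characters the inverted symbol is the bitwise complement of the expected one. The helper names below are hypothetical, and the sketch works on already-decoded bytes rather than on 10-bit code groups.

```python
def d_code(x: int, y: int) -> int:
    """Byte value of the 8b/10b data character Dx.y."""
    return (y << 5) | x


TS1_ID, TS2_ID = d_code(10, 2), d_code(5, 2)               # expected identifiers
TS1_INVERTED, TS2_INVERTED = d_code(21, 5), d_code(26, 5)  # seen when D+/D- are swapped


def lane_polarity_inverted(identifier_symbols) -> bool:
    """Inspect symbols 6-15 of a TS1/TS2 ordered set and report polarity inversion."""
    symbols = set(identifier_symbols)
    if symbols == {TS1_INVERTED} or symbols == {TS2_INVERTED}:
        return True
    if symbols == {TS1_ID} or symbols == {TS2_ID}:
        return False
    raise ValueError("not a recognizable TS1/TS2 identifier field")
```

Ten received bytes of 0xB5 (D21.5) in place of 0x4A (D10.2) indicate that the receiver should invert the lane.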
3.1.8.4 Reset

The PCIe PHY can initiate a core reset to the 82574. The reset can be caused by three sources:

• Upstream move to hot reset - in-band mechanism (LTSSM).
• Recovery failure (LTSSM returns to detect).
• Upstream component move to disable.

3.1.8.5 Scrambler Disable

The scrambler/de-scrambler functionality in the 82574 can be disabled by either of two mechanisms:

• By the upstream component, according to the PCIe specification.
• By an NVM bit.

3.1.9 Performance Monitoring

The 82574 incorporates PCIe performance monitoring counters to provide common capabilities for evaluating performance. The 82574 implements four 32-bit counters to correlate between concurrent measurements of events, as well as the sample delay and interval timers. The four 32-bit counters can also operate as two 64-bit counters to count long intervals or payloads.

The list of events supported by the 82574 and the counter control bits are described in the memory register map.

3.2 Ethernet Interface

The 82574 MAC provides a complete CSMA/CD function, supporting IEEE 802.3 (10 Mb/s), 802.3u (100 Mb/s), 802.3z, and 802.3ab (1000 Mb/s) implementations. The 82574 performs all of the functions required for transmission, reception, and collision handling called out in the standards.

The GMII/MII mode used to communicate between the MAC and the PHY supports 10/100/1000 Mb/s operation, with both half- and full-duplex operation at 10/100 Mb/s, and only full-duplex operation at 1000 Mb/s.

Note: The 82574 MAC is optimized for full-duplex operation in 1000 Mb/s mode. Half-duplex 1000 Mb/s operation is not supported.

The PHY features 10/100/1000BASE-T signaling and is capable of performing intelligent power management based on both the system power state and LAN energy detection (detection of unplugged cables). Power management includes the ability to shut down to an extremely low (powered-down) state when not needed, as well as the ability to auto-negotiate to lower-speed 10/100 Mb/s operation when the system is in low power states.

3.2.1 MAC/PHY GMII/MII Interface

The 82574 MAC and PHY communicate through an internal GMII/MII interface that can be configured for either 1000 Mb/s operation (GMII) or 10/100 Mb/s (MII) mode of operation. For proper network operation, both the MAC and PHY must be properly configured (either explicitly via software or via hardware auto-negotiation) to identical speed and duplex settings. All MAC configuration is performed using device control registers mapped into system memory or I/O space; an internal MDIO/MDC interface, accessible via software, is used to configure PHY operation.
The internal gigabit media independent interface (GMII) mode of operation is similar to the MII mode of operation. GMII mode uses the same MDIO/MDC management interface and registers for PHY configuration as MII mode. These common elements of operation enable the 82574 MAC and PHY to cooperatively determine a link partner's operational capability and configure the hardware based on those capabilities.

3.2.1.1 MDIO/MDC

The 82574 implements an internal IEEE 802.3 MII management interface (also known as the management data input/output or MDIO interface) between the MAC and PHY. This interface provides the MAC and software with the ability to monitor and control the state of the PHY. The internal MDIO interface defines a physical connection, a special protocol that runs across the connection, and an internal set of addressable registers. The internal interface consists of a data line (MDIO) and a clock line (MDC), which are accessible by software via the MAC register space.

Software can use MDIO accesses to read or write registers in either GMII or MII mode by accessing the 82574's MDIC register (see section 10.2.2.7).

3.2.1.2 Other MAC/PHY Control and Status

In addition to the internal GMII/MII communication and MDIO interface between the MAC and the PHY, the 82574 implements a handful of additional internal signals between the MAC and PHY, which provide richer control and features.

• PHY reset - The MAC provides an internal reset to the PHY. This signal combines the PCI_RST_N input from the PCI bus and the PHY reset bit of the device control register (CTRL.PHY_RST).
• PHY link status indication - The PHY provides a direct internal indication of link status (LINK) to the MAC to indicate whether it has sensed a valid link partner. Unless the PHY has been configured via its MII management registers to assert this indication unconditionally, this signal is a valid indication of whether a link is present. The MAC relies on this internal indication to reflect the STATUS.LU status as well as to initiate actions such as generating interrupts on link status changes, re-initiating link speed sense, etc.
• PHY duplex indication - The PHY provides a direct internal indication to the MAC of its resolved duplex mode (FDX). Normally, auto-negotiation by the PHY enables the PHY to resolve full- or half-duplex communication with the link partner (except when the PHY is forced through MII register settings). The MAC normally uses this signal after a link loss/restore to ensure that the MAC is configured consistently with the re-linked PHY settings. This indication is effectively visible through the MAC register bit STATUS.FD whenever the MAC speed has not been forced.
• PHY speed indication(s) - The PHY provides direct internal indications (SPD_IND) to the MAC of its negotiated speed (10/100/1000 Mb/s). The result of this indication is effectively visible through the MAC register bits STATUS.SPEED whenever the MAC speed has not been forced.
• MAC Dx power state indication - The MAC indicates its ACPI power state (PWR_STATE) to the PHY to enable it to perform intelligent power management (provided that PHY power management is enabled in the MAC CTRL register).

3.2.2 Duplex Operation for Copper PHY/GMII/MII Operation

The 82574 supports half-duplex and full-duplex 10/100 Mb/s MII mode or 1000 Mb/s GMII mode.

The duplex operation of the 82574 can either be forced or determined via the auto-negotiation process. See section 3.2.3 for details on link configuration setup and resolution.
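Software usually observes the link, duplex, and speed indications listed in section 3.2.1.2 by reading the device status register. The sketch below decodes a raw status value into those three fields. The bit positions (FD at bit 0, LU at bit 1, SPEED at bits 7:6) are assumptions for illustration and should be checked against the status register description; `decode_status` is a hypothetical helper.

```python
def decode_status(status: int) -> dict:
    """Decode STATUS.LU, STATUS.FD, and STATUS.SPEED from a raw register value."""
    speed_field = (status >> 6) & 0x3            # ASSUMED: SPEED occupies bits 7:6
    speed_mbps = {0: 10, 1: 100}.get(speed_field, 1000)
    return {
        "link_up": bool(status & 0x2),           # ASSUMED: LU is bit 1
        "full_duplex": bool(status & 0x1),       # ASSUMED: FD is bit 0
        "speed_mbps": speed_mbps,
    }
```

Under these assumptions, a value of 0x83 would decode as link up, full duplex, 1000 Mb/s.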
3.2.2.1 Full Duplex

All aspects of the IEEE 802.3, 802.3u, 802.3z, and 802.3ab specifications are supported in full-duplex operation. Full-duplex operation is enabled by several mechanisms, depending on the speed configuration of the 82574 and the specific capabilities of the link partner used in the application. During full-duplex operation, the 82574 might transmit and receive packets simultaneously across the link interface.

In full-duplex GMII/MII mode, transmission and reception are delineated independently by the GMII/MII control signals. Transmission starts at the assertion of TX_EN, which indicates there is valid data on the TX_DATA bus driven from the MAC to the PHY. Reception is signaled by the PHY by the assertion of the RX_DV signal, which indicates valid receive data on the RX_DATA lines to the MAC.

3.2.2.2 Half Duplex

The 82574 MAC can operate in half duplex. In half-duplex operation, the MAC attempts to avoid contention with other traffic on the link by monitoring the CRS signal provided by the PHY and deferring to passing traffic. When the CRS signal is de-asserted or after a sufficient inter-packet gap (IPG) has elapsed after a transmission, frame transmission begins. The MAC signals the PHY with TX_EN at the start of transmission.

If a collision occurs, the PHY detects the collision and asserts the COL signal to the MAC. Transmission of the frame stops within four link clock times and the 82574 sends a jam sequence onto the link. After the end of a collided transmission, the 82574 backs off and attempts to re-transmit per the standard CSMA/CD method.

Note: The re-transmissions are done from the data stored internally in the 82574 MAC transmit packet buffer (no re-access to the data in host memory is performed).

After a successful transmission, the 82574 is ready to transmit any other frame(s) queued in the MAC's transmit FIFO, after the minimum inter-frame spacing (IFS) of the link has elapsed.
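The "standard CSMA/CD method" mentioned above is the truncated binary exponential backoff of IEEE 802.3. The following is a minimal sketch of that rule only, not of the 82574's internal implementation; the function names are hypothetical, and the constants are the IEEE 802.3 values for 10/100 Mb/s half duplex.

```python
import random

SLOT_TIME_BITS = 512   # IEEE 802.3 slot time at 10/100 Mb/s, in bit times
BACKOFF_LIMIT = 10     # the exponent stops growing after 10 collisions
ATTEMPT_LIMIT = 16     # the frame is abandoned on the 16th collision


def backoff_range(collisions: int) -> int:
    """Exclusive upper bound of the slot count drawn after the n-th collision."""
    if not 1 <= collisions < ATTEMPT_LIMIT:
        raise ValueError("frame is dropped once the attempt limit is reached")
    return 2 ** min(collisions, BACKOFF_LIMIT)


def backoff_slots(collisions: int, rng=random) -> int:
    """Number of slot times to wait before the next attempt (hypothetical helper)."""
    return rng.randrange(backoff_range(collisions))


def backoff_bit_times(collisions: int, rng=random) -> int:
    """Same delay expressed in bit times on the wire."""
    return backoff_slots(collisions, rng) * SLOT_TIME_BITS
```

The random draw is what lets two colliding stations desynchronize; the range doubles with each collision until it is capped at 1024 slot times.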
During transmit, the PHY is expected to signal carrier sense (assert the CRS signal) back to the MAC before one slot time has elapsed. The transmission completes successfully even if the PHY fails to indicate CRS within the slot time window; if this situation occurs, the PHY might be configured incorrectly or be in a link-down situation. Such an event is counted in the Transmit Without CRS statistic register (see section 10.2.7.11).

3.2.3 Auto-Negotiation & Link Setup Features

The method for configuring the link between two link partners is highly dependent on the mode of operation. Configuration of the link can be accomplished by several methods:

• Software forcing link settings
• Software-controlled negotiation
• MAC-controlled auto-negotiation
• Auto-negotiation initiated by a PHY

The following sections describe the processes of bringing the link up, including configuration of the 82574 and the transceiver, as well as the various methods of determining duplex and speed configuration.
The PHY performs auto-negotiation per 802.3ab clause 40 and extensions to clause 28. Link resolution is obtained by the MAC from the PHY after the link has been established. The MAC accomplishes this via the MDIO interface, via specific signals from the PHY to the MAC, or by MAC auto-detection functions.

3.2.3.1 Link Configuration

Link configuration is generally determined by PHY auto-negotiation. The software device driver must intervene in cases where a successful link is not negotiated or a user desires to manually configure the link. The following sections discuss the methods of link configuration for copper PHY operation.

3.2.3.1.1 PHY Auto-Negotiation (Speed, Duplex, Flow Control)

The PHY performs the auto-negotiation function. The details of this operation are described in the IEEE P802.3ab draft standard and are not included here.

Auto-negotiation provides a method for two link partners to exchange information in a systematic manner in order to establish a link configuration providing the highest common level of functionality supported by both partners. Once configured, the link partners exchange configuration information to resolve link settings such as:

• Speed: 10/100/1000 Mb/s
• Duplex: full or half
• Flow control operation

PHY-specific information required for establishing the link is also exchanged.

Note: If flow control is enabled in the 82574, the settings for the desired flow control behavior must be set by software in the PHY registers and auto-negotiation restarted. After auto-negotiation completes, the software device driver must read the PHY registers to determine the resolved flow control behavior of the link and reflect it in the MAC register settings (CTRL.TFCE and CTRL.RFCE). If no software device driver is loaded and auto-negotiation is enabled, hardware sets these bits in accordance with the auto-negotiation results.

Note: By default, the PHY advertises flow control support.
Since the management path does not support flow control, management should change this default. Therefore, when management is active and there is no software device driver loaded, it should disable the flow control support and restart auto-negotiation.

Note: Once PHY auto-negotiation completes, the PHY asserts a link indication (LINK) to the MAC. Software must set the Set Link Up bit in the Device Control register (CTRL.SLU) before the MAC recognizes the link indication from the PHY and can consider the link to be up.
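The note above that the driver must read the PHY registers and reflect the resolved flow control in CTRL.TFCE and CTRL.RFCE can be illustrated with the PAUSE resolution rules of IEEE 802.3 Annex 28B. This is a sketch under stated assumptions, not driver code: the function names are hypothetical, and the two masks are the standard PAUSE and ASM_DIR bit positions in the MII auto-negotiation advertisement (register 4) and link partner ability (register 5) words.

```python
# Standard MII bit positions (assumed to apply to the 82574 PHY registers as well).
PAUSE_CAP = 0x0400   # symmetric PAUSE capability
PAUSE_ASYM = 0x0800  # asymmetric PAUSE direction (ASM_DIR)


def resolve_flow_control(local_pause, local_asm, partner_pause, partner_asm):
    """Return (tfce, rfce): whether to enable PAUSE transmit and PAUSE receive."""
    if local_pause and partner_pause:
        return (True, True)            # symmetric flow control
    if not local_pause and local_asm and partner_pause and partner_asm:
        return (True, False)           # local side sends PAUSE only
    if local_pause and local_asm and not partner_pause and partner_asm:
        return (False, True)           # local side honors PAUSE only
    return (False, False)              # flow control disabled


def resolve_from_mii(advertisement: int, partner_ability: int):
    """Same resolution, starting from the raw 16-bit register values."""
    return resolve_flow_control(
        bool(advertisement & PAUSE_CAP),
        bool(advertisement & PAUSE_ASYM),
        bool(partner_ability & PAUSE_CAP),
        bool(partner_ability & PAUSE_ASYM),
    )
```

A driver would then write the two returned booleans into CTRL.TFCE and CTRL.RFCE respectively.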
3.2.3.1.2 MAC Speed Resolution

For proper link operation, both the MAC and PHY must be configured for the same speed of link operation. The speed of the link can be determined and set by several methods with the 82574. These include:

• Software-forced configuration of the MAC speed setting based on PHY indications, which can be determined as follows:
  - Software reads the PHY registers directly to determine the PHY's auto-negotiated speed
  - Software reads the PHY's internal PHY-to-MAC speed indication (SPD_IND) using the MAC STATUS.SPEED register
  - Software signals the MAC to attempt to auto-detect the PHY speed from the PHY-to-MAC RX_CLK, then programs the MAC speed accordingly
• The MAC automatically detecting and setting the link speed of the MAC based on PHY indications by:
  - Using the PHY's internal PHY-to-MAC speed indication (SPD_IND), setting the MAC speed automatically
  - Attempting to auto-detect the PHY speed from the PHY-to-MAC RX_CLK and setting the MAC speed automatically

Aspects of these methods are discussed in the sections that follow.

3.2.3.1.2.1 Forcing MAC Speed

There might be circumstances when the software device driver must forcibly set the link speed of the MAC. This can occur when the link is manually configured. To force the MAC speed, the software device driver must set the CTRL.FRCSPD (force-speed) bit to 1b and then write the speed bits in the Device Control register (CTRL.SPEED) to the desired speed setting. See section 10.2.2.1 for details.

Note: Forcing the MAC speed using CTRL.FRCSPD overrides all other mechanisms for configuring the MAC speed and can yield non-functional links if the MAC and PHY are not operating at the same speed/configuration.

When forcing the 82574 to a specific speed configuration, the software device driver must also ensure the PHY is configured to a speed setting consistent with the MAC speed settings.
This implies that software must access the PHY registers to either force the PHY speed or to read the PHY status register bits that indicate the link speed of the PHY.

Note: Forcing speed settings by CTRL.SPEED can also be accomplished by setting the CTRL_EXT.SPD_BYPS bit. This bit bypasses the MAC's internal clock switching logic and gives the software device driver complete control of when the speed setting takes place. The CTRL.FRCSPD bit uses the MAC's internal clock switching logic, which delays the effect of the speed change.

3.2.3.1.2.2 Using PHY Direct Link-Speed Indication

The 82574 PHY provides a direct internal indication of its speed to the MAC (SPD_IND). The most direct method for determining the PHY link speed and either manually or automatically configuring the MAC speed is based on these direct speed indications.

For MAC speed to be set/determined from these direct internal indications from the PHY, the MAC must be configured such that CTRL.ASDE and CTRL.FRCSPD are both 0b (both auto-speed detection and forced-speed override are disabled). As a result, the MAC speed is reconfigured automatically each time the PHY indicates a new link-up event to the MAC.
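The two MAC speed configurations described in sections 3.2.3.1.2.1 and 3.2.3.1.2.2 differ only in a few CTRL bits. The sketch below computes the new CTRL value for each case. The bit positions are an assumption taken from the layout commonly used for this controller family (ASDE bit 5, SPEED bits 9:8, FRCSPD bit 11) and should be checked against section 10.2.2.1; the function names are hypothetical.

```python
# Assumed CTRL bit layout; verify against the Device Control register description.
CTRL_ASDE = 1 << 5            # auto-speed detection enable
CTRL_SPEED_SHIFT = 8
CTRL_SPEED_MASK = 0x3 << CTRL_SPEED_SHIFT
CTRL_FRCSPD = 1 << 11         # force speed

SPEED_CODE = {10: 0b00, 100: 0b01, 1000: 0b10}


def force_mac_speed(ctrl: int, speed_mbps: int) -> int:
    """Return CTRL with FRCSPD set and CTRL.SPEED programmed to speed_mbps."""
    code = SPEED_CODE[speed_mbps]                 # KeyError on unsupported speeds
    ctrl &= ~CTRL_SPEED_MASK & 0xFFFFFFFF
    return ctrl | CTRL_FRCSPD | (code << CTRL_SPEED_SHIFT)


def use_phy_speed_indication(ctrl: int) -> int:
    """Return CTRL with ASDE and FRCSPD both cleared (MAC follows SPD_IND)."""
    return ctrl & ~(CTRL_ASDE | CTRL_FRCSPD) & 0xFFFFFFFF
```

Only the register value is computed here; a real driver would still have to bring the PHY to the same speed, as the text requires.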
When MAC speed is neither forced nor auto-sensed by the MAC, the current MAC speed setting and the speed indicated by the PHY are reflected in the Device Status register bits STATUS.SPEED.

3.2.3.1.3 MAC Full/Half Duplex Resolution

The duplex configuration of the link is also resolved by the PHY during the auto-negotiation process. The 82574 PHY provides an internal indication to the MAC of the resolved duplex configuration using an internal full-duplex indication (FDX).

This internal duplex indication is normally sampled by the MAC each time the PHY indicates the establishment of a good link (LINK indication). The PHY's indicated duplex configuration is applied in the MAC and reflected in the MAC Device Status register (STATUS.FD).

Software can override the duplex setting of the MAC via the CTRL.FD bit when the CTRL.FRCDPLX (force duplex) bit is set. If CTRL.FRCDPLX is 0b, the CTRL.FD bit is ignored and the PHY's internal duplex indication is applied.

3.2.3.1.4 Using PHY Registers

The software device driver might be required under some circumstances to read from or write to the MII management registers in the PHY. These accesses are performed via the MDIC register (see section 10.2.2.7). The MII registers give the software device driver direct control over the PHY's operation, which might include:

• Resetting the PHY
• Setting the preferred link configuration for advertisement during the auto-negotiation process
• Restarting the auto-negotiation process
• Reading auto-negotiation status from the PHY
• Forcing the PHY to a specific link configuration

The set of PHY management registers required for all PHY devices can be found in the IEEE P802.3ab draft standard. The registers for the 82574 PHY are described in section 10.2.

3.2.3.1.5 Comments Regarding Forcing Link

Forcing link requires the software device driver to configure both the MAC and PHY in a consistent manner with respect to each other.
After initialization, the software device driver configures the desired modes in the MAC, then accesses the PHY registers to set the PHY to the same configuration.

Before enabling the link, the speed and duplex settings of the MAC can be forced by software using the CTRL.FRCSPD, CTRL.FRCDPLX, CTRL.SPEED, and CTRL.FD bits. After the PHY and MAC have both been configured, the software device driver should write a 1b to the CTRL.SLU bit.

3.2.4 Loss of Signal/Link Status Indication

The PHY LOS/LINK signal provides an indication of physical link status to the MAC. This signal from the PHY indicates whether the link is up or down, and is typically asserted after successful auto-negotiation. Assuming that the MAC is configured with CTRL.SLU = 1b, the MAC status bit STATUS.LU, when read, generally reflects whether the PHY has link (except under forced-link setup, where even the PHY link indication might have been forced).
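Several of the indications discussed so far (STATUS.LU, STATUS.FD and STATUS.SPEED) are read from the same Device Status register. The following sketch decodes them from a raw 32-bit value. The bit positions (FD bit 0, LU bit 1, SPEED bits 7:6, with both 10b and 11b meaning 1000 Mb/s) are an assumption based on the layout commonly used for this controller family, and the function name is hypothetical.

```python
# Assumed STATUS bit layout; verify against the Device Status register description.
STATUS_FD = 1 << 0
STATUS_LU = 1 << 1
STATUS_SPEED_SHIFT = 6
STATUS_SPEED_MASK = 0x3


def decode_status(status: int) -> dict:
    """Decode link, duplex and speed fields from a STATUS register value."""
    code = (status >> STATUS_SPEED_SHIFT) & STATUS_SPEED_MASK
    speed = {0b00: 10, 0b01: 100}.get(code, 1000)   # 10b and 11b: 1000 Mb/s
    return {
        "link_up": bool(status & STATUS_LU),
        "full_duplex": bool(status & STATUS_FD),
        "speed_mbps": speed,
    }
```

The speed and duplex fields are only meaningful while link_up is true and the MAC settings have not been forced.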
When the link indication from the PHY is de-asserted, the MAC considers this to be a transition to a link-down situation (such as cable unplugged, loss of link partner, etc.). If the LSC (link status change) interrupt is enabled, the MAC generates an interrupt to be serviced by the software device driver. See section 7.4 and section 10.2.4 for more details.

3.2.5 10/100 Mb/s Specific Performance Enhancements

3.2.5.1 Adaptive IFS

The 82574 supports back-to-back transmit inter-frame spacing (IFS) of 960 ns in 100 Mb/s operation and 9.6 µs in 10 Mb/s operation. Although back-to-back transmission is normally desirable, it can hurt performance in half-duplex environments due to excessive collisions. Excessive collisions are likely to occur in environments where one station is attempting to send large frames back-to-back while another station is attempting to send acknowledge (ACK) packets.

The 82574 contains an Adaptive IFS register (see section 10.2.6.3) that enables the implementation of a driver-based adaptive IFS algorithm for collision reduction, similar to the one used with Intel's other Ethernet products (such as PRO/100 adapters). Adaptive IFS throttles back-to-back transmissions in the transmit MAC by delaying their transfer to the CSMA/CD transmit function, and can therefore be used to delay the transmission of back-to-back packets on the wire. Normally, this register should be set to zero. However, if additional delay is desired between back-to-back transmits, this register can be set to a value greater than zero. This can be helpful in high-collision half-duplex environments.

The AIFS field provides a similar function to the IPGT field in the TIPG register (see section 10.2.6.3). However, this adaptive IFS throttle register counts in units of GTX/MTX_CLK clocks, which are 800 ns, 80 ns, and 8 ns for 10/100/1000 Mb/s mode respectively, and is 16 bits wide, thus providing a greater maximum delay value.
Using values lower than a certain minimum (determined by the ratio of the GTX/MTX_CLK clock to the link speed) has no effect on back-to-back transmission. This is because the 82574 does not start transmission until the minimum IEEE IFS (9.6 µs at 10 Mb/s, 960 ns at 100 Mb/s, and 96 ns at 1000 Mb/s) has been met, regardless of the value of adaptive IFS.

For example, if the 82574 is configured for 100 Mb/s operation, the minimum IEEE IFS is 960 ns. Setting AIFS to a value of 10 (decimal) would not affect back-to-back transmission time on the wire because the 800 ns delay introduced (10 * 80 ns = 800 ns) is less than the minimum IEEE IFS delay of 960 ns. However, setting this register to a value of 20 (decimal), which corresponds to 1600 ns in the above example, would delay back-to-back transmits because the resulting 1600 ns delay is greater than the minimum IFS time of 960 ns.

It is important to note that this register has no effect on transmissions that occur immediately after receives or on transmissions that are not back-to-back (unlike the IPGR1 and IPGR2 values in the TIPG register; see section 10.2.6.2). In addition, adaptive IFS has no effect on re-transmission timing (re-transmissions occur after collisions). Therefore, AIFS only takes effect in back-to-back transmission.

Note: The AIFS value is not additive to the TIPG.IPGT value; instead, the actual IPG equals the larger of the two, AIFS and TIPG.IPGT.
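The arithmetic in the example above can be written out as a short calculation. This sketch uses only the clock periods and minimum IFS values quoted in the text; the function names are hypothetical and the model ignores TIPG.IPGT settings above the IEEE minimum.

```python
AIFS_CLOCK_NS = {10: 800, 100: 80, 1000: 8}    # GTX/MTX_CLK period per link speed
MIN_IFS_NS = {10: 9600, 100: 960, 1000: 96}    # minimum IEEE IFS per link speed


def back_to_back_gap_ns(aifs: int, speed_mbps: int) -> int:
    """Gap between back-to-back transmits: the larger of AIFS delay and minimum IFS."""
    if not 0 <= aifs <= 0xFFFF:
        raise ValueError("AIFS is a 16-bit field")
    return max(aifs * AIFS_CLOCK_NS[speed_mbps], MIN_IFS_NS[speed_mbps])


def smallest_effective_aifs(speed_mbps: int) -> int:
    """Smallest AIFS value whose delay exceeds the minimum IEEE IFS."""
    return MIN_IFS_NS[speed_mbps] // AIFS_CLOCK_NS[speed_mbps] + 1
```

At every speed the ratio of minimum IFS to clock period is 12, so under this model AIFS values of 12 or less leave the gap unchanged and 13 is the first value that lengthens it.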
3.2.6 Flow Control

Flow control as defined in 802.3x, as well as the specific operation of asymmetrical flow control defined by 802.3z, are supported in the MAC. The following seven registers are defined for the implementation of flow control:

• Flow Control Address Low (FCAL) - 6-byte flow control multicast address
• Flow Control Address High (FCAH) - 6-byte flow control multicast address
• Flow Control Type (FCT) - 16-bit field that indicates flow control type
• Flow Control Receive Threshold High (FCRTH) - 13-bit high-water mark indicating receive buffer fullness
• Flow Control Receive Threshold Low (FCRTL) - 13-bit low-water mark indicating receive buffer emptiness
• Flow Control Transmit Timer Value (FCTTV) - 16-bit timer value to include in transmitted PAUSE frames
• Flow Control Refresh Threshold Value (FCRTV) - 16-bit PAUSE refresh threshold value

Flow control allows for local control of network congestion levels. Flow control is implemented as a means of reducing the possibility of receive buffer overflows, which result in the dropping of received packets. Flow control is accomplished by notifying the transmitting station that the receiving station's receive buffer is nearly full.

Implementing asymmetric flow control allows one link partner to send flow control packets while being allowed to ignore their reception (for example, it is not required to respond to PAUSE frames).

3.2.6.1 MAC Control Frames and Reception of Flow Control Packets

Three comparisons are used to determine the validity of a flow control frame. All three must be true for a positive result.

1. A match on the six-byte multicast address for MAC control frames or on the station address of the device (Receive Address Register 0).
2. A match on the type field.
3. A comparison of the MAC control opcode field.

The 802.3x standard defines the MAC control frame multicast address as 01-80-C2-00-00-01.
This address must be loaded into the Flow Control Address Low/High registers (FCAL/FCAH).

The Flow Control Type (FCT) register contains a 16-bit field that is compared against the flow control packet's type field to determine if it is a valid flow control packet: XON or XOFF. 802.3x reserves this as 0x8808. This value must be loaded into the Flow Control Type register.

The final check for a valid PAUSE frame is the MAC control opcode. At this time, only the PAUSE control frame opcode is defined. It has a value of 0x0001.

Frame-based flow control differentiates XOFF from XON based on the value of the PAUSE timer field. Non-zero values constitute XOFF frames while a value of zero constitutes an XON frame. Values in the timer field are in units of slot time. A slot time is hard wired to 64 byte times, or 512 ns.

Note: An XON frame signals the cancellation of the pause initiated by an XOFF frame (pause for zero slot times).
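The three validity checks and the XON/XOFF distinction described above can be modeled on a raw frame. This is an illustrative sketch, not the hardware's logic: it assumes the frame buffer starts at the destination address (no preamble or SFD), that the PAUSE timer is the first two bytes of the MAC control parameters, and the function name is hypothetical.

```python
PAUSE_MULTICAST = bytes.fromhex("0180c2000001")  # 802.3x MAC control address
FLOW_CONTROL_TYPE = 0x8808                       # value loaded into FCT
PAUSE_OPCODE = 0x0001


def classify_pause_frame(frame: bytes, station_address: bytes):
    """Return ("XOFF", timer), ("XON", 0), or None if the frame fails a check."""
    if len(frame) < 18:
        return None
    destination = frame[0:6]
    frame_type = int.from_bytes(frame[12:14], "big")
    opcode = int.from_bytes(frame[14:16], "big")
    timer = int.from_bytes(frame[16:18], "big")
    if destination not in (PAUSE_MULTICAST, station_address):
        return None                    # check 1: address match
    if frame_type != FLOW_CONTROL_TYPE:
        return None                    # check 2: type field match
    if opcode != PAUSE_OPCODE:
        return None                    # check 3: MAC control opcode
    return ("XOFF", timer) if timer else ("XON", 0)
```

A returned timer of 16, for instance, asks the receiver to pause for 16 slot times.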
Figure 6. 802.3x MAC control frame format

Field | Size
S | 1 byte
Preamble | up to 6 bytes
SFD | 1 byte
Destination address | 6 bytes
Source address | 6 bytes
Type/length | 2 bytes
MAC control opcode | 2 bytes
MAC control parameters | (min_framesize - 160)/8 bytes
FCS | 4 bytes
T | 1 byte

Where S is the start-of-packet delimiter and T is the first part of the end-of-packet delimiters for 802.3z encapsulation.

The receiver is enabled to receive flow control frames if flow control is enabled via the RFCE bit in the Device Control (CTRL) register.

Note: Flow control capability must be negotiated between link partners via the auto-negotiation process. The auto-negotiation process might modify the value of these bits based on the resolved capability between the local device and the link partner.

Once the receiver validates receiving an XOFF or PAUSE frame, the 82574 performs the following:

• Increments the appropriate statistics register(s).
• Sets the TXOFF bit in the Device Status (STATUS) register.
• Initializes the pause timer based on the packet's PAUSE timer field.
• Disables packet transmission or schedules the disabling of transmissions after the current packet completes.

Resuming transmission can occur under the following conditions:

• An expired pause timer
• Receiving an XON frame (a frame with its PAUSE timer set to zero)

Either condition clears the TXOFF status bit in the Device Status register and transmission can resume. Note that hardware records the number of received XON frames.
3.2.6.2 Discard PAUSE Frames and Pass MAC Control Frames

Two bits in the Receive Control register are implemented specifically for control over receipt of PAUSE and MAC control frames. These bits are Discard PAUSE Frames (DPF) and Pass MAC Control Frames (PMCF). See section 10.2.6.2 for DPF and PMCF bit definitions.

The DPF bit forces the discarding of any valid PAUSE frame addressed to the 82574's station address. If the packet is a valid PAUSE frame and is addressed to the station address (Receive Address [0]), the 82574 does not pass the packet to host memory if the DPF bit is set to logic high. However, if a flow control packet is sent to the station address and is a valid flow control frame, it is transferred to host memory when DPF is set to 0b. This bit has no effect on PAUSE operation, only on the DMA function.

The PMCF bit enables passing to the system any valid MAC control frame that does not have a valid PAUSE opcode. In other words, the frame must have the correct MAC control frame multicast address (or the MAC station address) as well as a type field that matches the FCT register, but does not have the defined PAUSE opcode of 0x0001. Frames of this type are transferred to host memory when PMCF is a logic high.

3.2.6.3 Transmitting PAUSE Frames

Transmitting PAUSE frames is enabled by software by writing a 1b to the TFCE bit in the Device Control register.

Note: Similar to receiving flow control packets, XOFF packets can be transmitted only if this configuration has been negotiated between the link partners via the auto-negotiation process. In other words, setting this bit indicates the desired configuration. Resolving the auto-negotiation process is described in section 3.2.3.

The content of the Flow Control Receive Threshold High register determines at what point hardware transmits a PAUSE frame. Hardware monitors the fullness of the receive FIFO and compares it with the contents of FCRTH.
When the threshold is reached, hardware sends a PAUSE frame with its pause time field equal to FCTTV.

At the time the threshold is reached, hardware starts an internal shadow counter (reflecting the pause time-out counter at the partner end) counting from zero. When the counter reaches the value indicated in the FCRTV register, then, if the PAUSE condition is still valid (meaning that the buffer fullness is still above the low watermark), an XOFF message is sent again and the shadow counter starts counting again.

Once the receive buffer fullness reaches the low water mark, hardware sends an XON message (a PAUSE frame with a timer value of zero). Software enables this capability with the XONE field of the FCRTL register.

Hardware sends one more PAUSE frame if it has previously sent one and the FIFO overflows (so the threshold must not be set greater than the FIFO size). This is intended to minimize the number of packets dropped if the first PAUSE frame does not reach its target. Since the secure receive packets use the same data path, the behavior is identical when secure packets are received.

Note: Transmitting flow control frames should only be enabled in full-duplex mode per the IEEE 802.3 standard. Software should ensure that transmitting flow control packets is disabled when the 82574 is operating in half-duplex mode.

Note: Regardless of the mechanism above, each time a receive packet is dropped due to lack of space in the internal receive buffer, a PAUSE frame is transmitted as well (if the TFCE bit in the Device Control register is set).
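The high/low watermark behavior described above amounts to a small hysteresis state machine. The sketch below models only the FCRTH and FCRTL crossings and the XONE enable; it deliberately leaves out the FCRTV refresh counter, the extra PAUSE frame on FIFO overflow, and the TFCE gate. The class name and the use of plain byte counts for the thresholds are assumptions made for illustration.

```python
class PauseGenerator:
    """Simplified model of XOFF/XON generation from receive buffer fullness."""

    def __init__(self, fcrth: int, fcrtl: int, xone: bool = True):
        if fcrtl >= fcrth:
            raise ValueError("low watermark must be below high watermark")
        self.fcrth = fcrth      # high watermark (hypothetical unit: bytes)
        self.fcrtl = fcrtl      # low watermark
        self.xone = xone        # XON transmission enabled
        self.paused = False     # True after XOFF was sent and not yet cancelled

    def update(self, fullness: int):
        """Return "XOFF", "XON" or None for a new buffer fullness sample."""
        if not self.paused and fullness >= self.fcrth:
            self.paused = True
            return "XOFF"
        if self.paused and fullness <= self.fcrtl:
            self.paused = False
            return "XON" if self.xone else None
        return None
```

Keeping the two thresholds apart prevents the model from emitting a PAUSE frame on every small fluctuation around a single level.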
3.2.6.4 Software Initiated PAUSE Frame Transmission

The 82574 has the added capability to transmit an XOFF frame via software. This is accomplished by software writing a 1b to the SWXOFF bit of the Transmit Control register. Once this bit is set, hardware initiates transmitting a PAUSE frame in a manner similar to that automatically generated by hardware.

The SWXOFF bit is self-clearing after the PAUSE frame has been transmitted.

The state of the CTRL.TFCE bit or the negotiated flow control configuration does not affect software-generated PAUSE frame transmission.

Note: Software sends an XON frame by programming a zero in the PAUSE timer field of the FCTTV register.

Note: XOFF transmission is not supported in 802.3x for half-duplex links. Software should not initiate an XOFF or XON transmission if the 82574 is configured for half-duplex operation.

3.3 SPI Non-Volatile Memory Interface

3.3.1 General Overview

The 82574 requires non-volatile content for its configuration. The non-volatile memory (NVM) might contain the following main regions:

• LAN configuration space accessed by hardware - Loaded by the 82574 after power up, PCI reset de-assertion, D3->D0 transition, or a software-commanded EEPROM read (CTRL_EXT.EE_RST).
• LAN configuration space accessed by software - Used by software only. The meaning of these registers as listed here is a convention for software only and is ignored by the 82574.

3.3.2 Supported NVM Devices

Previous GbE controllers required both EEPROM and flash to store data. The 82574 reduces bill of material (BOM) cost by consolidating the flash and EEPROM into a single NVM. The NVM is connected to a single SPI interface.

EEPROM: The 82574 is compatible with many sizes of 4-wire SPI EEPROM devices. The recommended EEPROMs for the 82574 are:

• 1 KB: STM* 95010W6, Catalyst* CAT25010S, or Atmel* AT25010N
• 2 KB: STM 95020W6, Catalyst CAT25020S, or Atmel AT25020N
• 32 KB: STM 95320W6, Catalyst CAT25C320S, or Atmel AT25320N

Typically, the EEPROM size should be 32 KB for supporting manageability, SMBus pass through, and Network Controller-Sideband Interface (NC-SI) over RMII. At 1 KB or 2 KB EEPROM sizes, manageability is not supported.
Flash: The size of the flash is selected by the system integrator according to its usage. The 82574 supports devices of up to 16 MB, which is beyond any requirements. The typical flash size for many applications of the 82574 is 4 MB. At any size, the 82574 has the following requirements from the flash: it must support a block erase instruction of 4 KB, and it should support the read device ID instruction, which enables software to identify an empty device type. The 82574 drives the flash at a frequency of ~15.6 MHz.

The following flash devices are recommended for use with the 82574: SST* 25VF0!0, PMC* PM25LV0X0, Winbond* W25X!0 or Atmel AT25FS0!0 (see footnote 1), where ! stands for flash sizes of 64 KB up to 2 MB.

Footnote 1: For SST and PMC devices, flash auto-detect is supported by reading the device ID. For Atmel and Winbond flash devices, auto-detect is not supported. Software needs to use a mechanism to read the flash characteristics directly from the NVM.

Table 25 lists the existing flash devices and their major characteristics:

Table 25. Flash devices - major characteristics

Characteristic | SST 25VF family | PMC 25xxx family | Winbond W25X family | Atmel AT25FS family
Size [bytes] | 0.5 MB, 1 MB, 2 MB | 64 KB, 128 KB | 128 KB, 256 KB, 0.5 MB | 256 KB, 0.5 MB
Maximum write burst size | 1 byte | 256 bytes | 256 bytes | 256 bytes
Minimum block erase size | 4 KB | 4 KB | 4 KB | 4 KB
Device erase instruction | 0x60 | 0xC7 | 0xC7 | 0x60 or 0xC7
Minimum block erase instruction (table note 1) | 0x20 | 0xD7 | 0x20 | 0x20 or 0xD7
64 KB block erase instruction | 0x52 | 0xD8 | 0xD8 | 0x52 or 0xD8
Read ID instruction | 0xAB or 0x90 | 0xAB | 0xAB or 0x90 | 0xAB or 0x9F
Byte program time | 20 µs | 30 µs | 100 µs | 30 µs
Page program time | - | 5 ms | 1.5 ms | 7.7 ms
Minimum block erase time | 25 ms | 100 ms | 150 ms | 50 ms
64 KB erase time | 25 ms | 100 ms | 1 s | 200 ms

Table note 1: Flashes supported by the 82574 must have bits 7, 6, 4 and 0 all equal in the minimum block erase instruction.

3.3.3 NVM Device Detection

The 82574 detects the device connected on the SPI interface in two phases.

1. It first detects the device type by the state of the NVMT strapping pin.
2. It then looks at the NVM content, depending on a valid signature in word 0x12 in the NVM.

For an EEPROM, the 82574 detects the length of the address bytes by sensing the signature at word 0x12. It then sets the NVADDS field in the EEC register. The exact size of the NVM is fetched by the 82574 from word 0x0F and is stored in the NVSIZE field in the EEC register. When operating with an EEPROM that has an invalid signature, software can force the address length via the NVADDS field in the EEC register. Controlling the address length enables software to access the EEPROM via the parallel EERD and EEWR registers in all cases, including an invalid signature.
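The requirement in the Table 25 note, that bits 7, 6, 4 and 0 of the minimum block erase instruction must all be equal, is easy to check for a candidate opcode. The sketch below does exactly that and nothing more; the function name is hypothetical.

```python
CHECKED_BITS = (7, 6, 4, 0)   # bit positions named in the Table 25 note


def erase_opcode_supported(opcode: int) -> bool:
    """True if bits 7, 6, 4 and 0 of the opcode all have the same value."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("SPI instruction opcodes are one byte")
    values = {(opcode >> bit) & 1 for bit in CHECKED_BITS}
    return len(values) == 1
```

Both minimum block erase opcodes listed in the table pass this check: 0x20 has all four bits clear and 0xD7 has all four bits set.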
3.3.3.1 CRC Field

CRC calculation and management is done by software.

3.3.4 Device Operation with an External EEPROM

When the 82574 is connected to an external EEPROM, it provides similar functionality to its predecessors with the following enhancements:

• Enables a complete parallel interface for read/write to the EEPROM.
• Enables software to specify the address length explicitly, thus eliminating the need for bit-banging access even on an empty EEPROM.

3.3.5 Device Operation with Flash

As previously stated, the 82574 merges the legacy EEPROM and flash content in a single flash device. The 82574 copies the lower section of the flash device to an internal shadow RAM. The interface to the shadow RAM is the same as the interface for an external EEPROM device. This mechanism provides a seamless backward-compatible interface for software to the legacy EEPROM space, as if an external EEPROM device were connected. The 82574 supports flash devices with a block erase size of 4 KB. Note that many flash vendors use the term sector differently. This document uses the term flash sector for a logical section of 4 KB.

3.3.5.1 LAN Configuration Sectors

Flash devices require a block erase instruction in case a cell is modified from 0b to 1b. As a result, in order to update a single byte (or block of data), it is required to erase it first. The first addresses of the flash contain the device configuration and must always be valid. The 82574 maintains two sectors of 4 KB, S0 and S1, for the configuration content. At least one of these two sectors is valid at any given time; otherwise the 82574 is configured with the hardware defaults. Section 3.3.6 provides more details on the shadow RAM and the first two sectors.

3.3.6 Shadow RAM

The 82574 includes an internal 4 KB shadow RAM of the first 4 KB flash sector(s).
When the 82574 is connected to a flash device, the legacy configuration parameters might reside in either of the first two 4 KB sectors (S0 or S1) in the flash. The 82574 copies that data to an internal shadow memory. The shadow RAM emulates a seamless EEPROM interface to the rest of the 82574 and the host CPU. This way the legacy configuration content is accessible to software and firmware on the same EEPROM registers as on previous GbE controllers.

Figure 7 shows the shadow RAM mapping and interface relative to the flash and the EEPROM. The external EEPROM and the shadow RAM share the same interface. The 82574 accesses the EEPROM or the shadow RAM according to the setting of the SELSHAD bit in the EEC register. By hardware default, the SELSHAD bit is set by the NVMT strapping pin so that the EEPROM is selected in the case of an external EEPROM and the shadow RAM is selected in the case of an external flash.

Note: Access to the shadow RAM uses the same interface as the external EEPROM, with the exception that bit banging is not supported for the shadow RAM.
Figure 7. NVM shadow RAM
[Diagram: the EEPROM interface reaches either the external EEPROM or the shadow RAM, selected by EEC.SELSHAD; the shadow RAM mirrors flash sector 0 (address 0 to 4 KB) or sector 1 (address 4 KB to 8 KB); the LAN flash interface accesses the flash directly.]

3.3.6.1 Flash Mode

The 82574 is initialized from the NVM. As part of the initialization sequence, the 82574 copies the 4 KB content of S0 or S1 from the flash to the shadow RAM. Any access to the EEPROM interface is directed to the shadow RAM. Following any write access to the shadow RAM by software or firmware, the data should also be updated in the flash. The 82574 maintains a watchdog timer defined by the FLASHT register to minimize flash updates. The timer is triggered by any write access to the shadow RAM. The 82574 updates the flash from the shadow RAM when the FLASHT timer expires or when firmware or software explicitly requests a flash update by setting the FLUPD bit in the FLA register. The 82574 copies the content of the shadow RAM to the inactive configuration sector and then makes it the active one. The flash update sequence is listed in the steps that follow:

1. Initiate block erase instruction(s) to the inactive sector (the inactive sector is defined by the inverse value of the SEC1VAL bit in the EEC register).
2. Copy the shadow RAM to the inactive sector, with the signature word copied last.
3. Clear the signature word in the active sector to make it invalid.
4. Toggle the state of the SEC1VAL bit in the EEC register to indicate that the inactive sector became the active one and vice versa.

Note: Software should be aware that actual programming of the flash might require a long latency following the write access to the shadow RAM. Software can poll the FLUDONE bit in the FLMNGCTL register to determine when flash programming has completed, when required.

3.3.6.2 EEPROM Mode

When the 82574 is attached to an external EEPROM, any access to the EEPROM interface is directed to the external EEPROM.
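The four-step flash update sequence of section 3.3.6.1 can be modeled in a few lines to show why one sector always stays valid. This is a toy model under stated assumptions: the class and attribute names are hypothetical, the signature is taken to be the 16-bit word at word address 0x12 (the word named in section 3.3.3), "clearing" it is modeled as writing zeros, and erased flash reads as 0xFF.

```python
SECTOR_SIZE = 4096
SIG_START = 0x12 * 2          # byte offset of word 0x12 (assumed signature word)
SIG_END = SIG_START + 2


class FlashModel:
    """Two 4 KB configuration sectors plus the shadow RAM that mirrors one of them."""

    def __init__(self, image: bytes):
        if len(image) != SECTOR_SIZE:
            raise ValueError("image must be exactly one 4 KB sector")
        self.sectors = [bytearray(image), bytearray(b"\xff" * SECTOR_SIZE)]
        self.sec1val = 0                  # 0: sector 0 is active, 1: sector 1
        self.shadow = bytearray(image)    # software writes land here

    def update_flash(self) -> None:
        active, inactive = self.sec1val, self.sec1val ^ 1
        target = self.sectors[inactive]
        # Step 1: block erase the inactive sector.
        target[:] = b"\xff" * SECTOR_SIZE
        # Step 2: copy the shadow RAM, writing the signature word last.
        target[:SIG_START] = self.shadow[:SIG_START]
        target[SIG_END:] = self.shadow[SIG_END:]
        target[SIG_START:SIG_END] = self.shadow[SIG_START:SIG_END]
        # Step 3: invalidate the old active sector by clearing its signature.
        self.sectors[active][SIG_START:SIG_END] = b"\x00\x00"
        # Step 4: the freshly written sector becomes the active one.
        self.sec1val = inactive
```

Because the signature is written last in step 2 and the old signature is cleared only in step 3, an interruption at any point leaves at least one sector with an intact signature in this model.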
3.3.7 nvm clients and interfaces
several clients might access the nvm or shadow ram: the software device driver, bios, firmware, and hardware. the following table lists these clients and their access type to the nvm.
table 26. clients and access type to the nvm
client + interface | nvm port | nvm instructions
host cpu on eec csr | eeprom | legacy bit banging
host cpu on eerd and eewr | eeprom | parallel word read and write to eeprom or shadow ram (controlled by the eec.selshad bit)
mng on eemng csr | eeprom | parallel word read and write to eeprom or shadow ram
host cpu on fla csr | flash | legacy bit banging and flash erase instructions
host cpu via bar | flash | read byte, word, and dword, and byte programming (see note 1)
host cpu via flswxxx csr registers | flash | host write access to the flash; no support for burst (multiple byte) writes
direct hw accesses | both | read eeprom/shadow ram at device initialization
note 1: following a write or erase instruction to the flash, the 82574 initiates a seamless write enable before the write or erase instruction and polls the status at the end to check its completion.
3.3.7.1 memory mapped host interface via lan flash bar
software might read and write to the flash via the lan flash bar. the flash bar is mapped to the physical flash at offset 0x0. the 82574 supports byte, word, and dword reads and byte writes through this interface. the host cpu is stalled until the read access to the flash completes.
note: one of the first two 4 kb sectors in the flash is also reflected in the shadow ram. during normal operation, when software requires access to these sectors, it should access the shadow ram. direct write accesses to the flash in this space via the flash bar might cause non-coherency between the flash and the shadow ram.
note: flash bar access while fla.fl_req is asserted (and granted) is forbidden.
3.3.7.2 csr mapped host interface
software has bit banging and parallel accesses to the nvm or shadow ram via the registers in the csr space. the 82574 supports the following cycles on the parallel interface: posted write, posted read, block erase, and device erase. access to the configuration space in the first two sectors is directed via the eeprom registers regardless of the external physical device. access to the rest of the nvm space is done according to the type of the physical device: flash registers for a flash and eeprom registers for an eeprom.
eeprom csr registers are as follows:
• eec register for bit banging and device control
• eerd and eewr registers for parallel read and write access
the flash csr registers are as follows:
• fla register and eec register for bit banging and device control
note: when software accesses the eeprom or flash spaces via the bit banging interface, it should follow these steps:
1. write a 1b to the request bit in the fla or eec registers.
2. poll the grant bit in the fla or eec registers until it is ready.
3. access the nvm using the direct interface to its signaling via the eec or fla registers.
4. when access completes, software should clear the request bit.
note: following a write or erase instruction, software should clear the request bit only after it has checked that the cycles were completed by the nvm.
3.3.7.3 csr mapped firmware interface
firmware might access the nvm or shadow ram via the nvm mng control registers in the csr space with the following capabilities:
• word read and write accesses to the eeprom or shadow ram via the eemngctl and eemngdata registers.
• read and write dma and block erase to the flash interface via the flmngctl and flmngdata registers.
flash accesses are mapped to the physical nvm at offset 0x0. note that nominal accesses to the first two 4 kb sectors should be addressed to the shadow ram via the eeprom interface.
3.3.8 nvm write and erase sequence
3.3.8.1 software flow to the bit banging interface
when software accesses the eeprom or flash csr registers through the bit banging interface, it should follow these steps:
1. write a 1b to the request bit in the fla or eec registers.
2. poll the grant bit in the fla or eec registers until it is ready.
3. access the nvm using the direct interface to its signaling via the eec or fla registers.
4. when access completes, software should clear the request bit.
note that following a write or erase instruction, software should clear the request bit only after it has checked that the cycles were completed by the nvm.
3.3.8.2 software byte program flow to the eeprom interface
software initiates a write cycle to the nvm on the parallel eeprom interface as follows:
1. poll the done bit in the eewr register until it is set.
2. write the data word, its address, and the start bit to the eewr register.
in response, hardware executes the following steps:
case 1 - the 82574 is connected to a physical eeprom device:
1. initiate an autonomous write enable instruction.
2. initiate the program instruction right after the enable instruction.
3. poll the eeprom status until programming completes.
4. set the done bit in the eewr register.
case 2 - the 82574 is connected to a physical flash device:
1. the 82574 writes the data to the shadow ram and sets the done bit in the eewr register.
2. the shadow ram content is then written to the flash device as described in section 3.3.6.
3.3.8.3 flash byte program flow
software initiates a byte write cycle via the flash bar as follows:
1. write access to the flash must first be enabled in the flew field in the eec register.
2. poll the flbusy flag in the fla register until it is cleared.
3. write the data byte to the flash through the flash bar.
4. repeat steps 2 and 3 if multiple bytes should be programmed.
5. clear the write enable in the flew field in the eec register to protect the flash device.
in response, hardware executes the following steps for each write access:
1. initiate an autonomous write enable instruction.
2. initiate the program instruction right after the enable instruction.
3. poll the flash status until programming completes.
4. clear the flbusy bit in the fla register.
note: this section explains only the actual programming of a single byte or multiple bytes.
3.3.8.4 flash erase flow
device erase flow: the software flow for erase instructions is almost identical to the program flow:
1. erase access to the flash must first be enabled in the flew field in the eec register.
2. poll the flbusy flag in the fla register until it is cleared.
3. set the flash erase bit (fl_er) in the fla register.
4. clear the erase enable in the flew field in the eec register to protect the flash device.
3.3.8.5 flash burst program flow
the 82574 provides a burst engine that can be useful for initial programming of the entire flash image according to the following flow:
1. set the addr field in the flswctl register to the byte resolution address.
2. set the cmd field in the flswctl register to 01b, which is the dma write setting.
3. write the first 32 bits of data to the flswdata register.
4. set the rdcnt field in the flswcnt register to the byte count.
5. set the cmdv field in the flswctl register to start a dma write.
6. hardware starts accessing the spi bus and begins writing the first 32 bits from the flswdata register.
7. once hardware writes the 32-bit data to the flash, the done bit in the flswctl register is set, indicating that the next 32 bits are required.
8. until new data is written to the flswdata register, the flash clock is paused.
9. once software writes data to the flswdata register, the done bit in the flswctl register is cleared; it is set again after hardware writes the data to the flash.
10. after all bytes are written to the flash, hardware completes the cycle on the spi bus and sets the wrdone bit in the flswctl register, indicating that the entire burst has completed.
3.3.8.6 flash programming flow of s0 and s1
other than initial programming of the flash device, software and firmware should not access the configuration sectors s0 and s1 directly. any access to the configuration sectors should go to the shadow ram via the eeprom interface registers.
3.4 system management bus (smbus)
note: the nc-si and smbus interfaces cannot be used together in the same implementation. one or the other is selected by the nvm image and loaded into the flash.
smbus is a low speed (100 khz) serial bus used to connect various components in a system for manageability purposes. smbus is used as an interface to pass traffic between the manageability controller (mc) and the 82574. the interface can also be used to enable the mc to configure the 82574's filters and management related capabilities. any device on the bus can be a master or a slave.
the smbus uses two primary signals, smbclk and smbdat, to communicate. the 82574's smb_clk and smb_data pins correspond to these signals. both of these signals float high with board-level pull-ups.
the smbus specification has defined various types of message protocols composed of individual bytes. the message protocols supported by the 82574 are described in section 8.0. for more details about smbus, see the smbus specification and section 8.0.
3.5 nc-si
the nc-si interface in the 82574 is a connection to an external mc. it operates as a single interface with an external mc, where all traffic between the 82574 and the mc flows through the interface.
see section 8.0 for more details.
note: the nc-si and smbus interfaces cannot be used together in the same implementation. one or the other is selected by the nvm image and loaded into the flash.
note: it is recommended that the mc turn off flow control packet reception on its mac to prevent the pause effect from a flow control packet that might arrive from the lan.
figure 8. nc-si interface [block diagram; labels: mc; 82574l; nc-si; llc - logical link control; mac - media access control; reconciliation; gmii; pcs; pma; pmd; mdi; medium]
3.5.1 interface specification
the 82574 nc-si interface meets the rmii specification, rev. 1.2, as a phy-side device. the following nc-si capabilities are not supported by the 82574:
• collision detection - the interface supports only full-duplex operation.
• mdio - mdio/mdc management traffic is not passed on nc-si.
• magic packets - magic packets are not detected at the 82574 nc-si receive end.
• flow control - the 82574 doesn't support flow control on this interface.
3.5.2 electrical characteristics
the 82574 complies with the electrical characteristics defined in the rmii specification. however, the 82574 is not 5 v dc tolerant and requires that signals conform to 3.3 v dc signaling.
the 82574 dynamically drives its nc-si output signals (nc-si_dv and nc-si_rx) as required by the sideband protocol:
• at power up, the 82574 floats the nc-si outputs.
• the 82574 drives the nc-si outputs as configured by the mc through the select package and deselect package commands.
4.0 initialization
4.1 introduction
this chapter discusses initialization steps. this includes:
• general hardware power-up state
• basic device configuration
• initialization of transmit and receive operation
• link configuration and software reset capability
• statistics initialization
4.2 reset operation
the 82574 reset sources are as follows:
• internal power on reset - the 82574 has an internal mechanism for sensing the power pins. once power is up and stable, the 82574 implements an internal reset. this reset acts as a master reset of the entire chip. it is level sensitive, and while it is 0b it holds all of the registers in reset. internal power on reset is an indication that device power supplies are all stable. internal power on reset changes state during system power up.
• pe_rst_n - indicates that both the power and the pcie clock sources are stable; a value of 0b indicates reset active. this pin also asserts an internal reset after a d3cold exit. most units are reset on the rising edge of pe_rst_n. the only exception is the pcie unit, which is kept in reset while pe_rst_n is active.
• device disable/dr disable - the 82574 enters a device disable mode when the dev_off_n pin is asserted without shutdown (see section 5.4.4.4). the 82574 enters dr disable mode when certain conditions are met in the dr state (see section 5.4.4.3).
• in-band pcie reset - the 82574 generates an internal reset in response to a physical layer (phy) message from pcie or when the pcie link goes down (entry to polling or detect state). this reset is equivalent to pci reset in previous (pci) gbe controllers.
• d3hot to d0 transition - this is also known as acpi reset. the 82574 generates an internal reset on the transition from d3hot power state to d0 (caused after configuration writes from d3 to d0 power state). note that this reset is per function and resets only the function that transitioned from d3hot to d0.
• software reset - software can reset the 82574 by writing the device reset bit of the device control (ctrl.rst) register. the 82574 re-reads the per-function nvm fields after a software reset. bits that are normally read from the nvm are reset to their default hardware values. note that this reset is per function and resets only the function that received the software reset. pci configuration space (configuration and mapping) of the device is unaffected.
• force tco - this reset is generated when manageability logic is enabled. it is only generated if the reset on force tco bit of the nvm's management control word is 1b. in pass-through mode it is generated when receiving a force tco smbus command with bit 1 or bit 7 set.
• eeprom reset - writing a 1b to the eeprom reset bit of the extended device control (ctrl_ext.ee_rst) register causes the 82574 to re-read the per-function configuration from the nvm, setting the appropriate bits in the registers loaded by the nvm.
• phy reset - software can write a 1b to the phy reset bit of the device control (ctrl.phy_rst) register to reset the internal phy.
the resets affect the following registers and logic:
table 27. 82574 resets
columns (reset activation): internal power on reset | pe_rst_n | device/dr disable | in-band pcie reset | d3hot to d0 | sw reset | force tco | ee reset | phy reset | notes
pcie data path ?? ? ?
load nvm ?? ? ? 6 ???
pci config registers ro ?? ? ?
pci config registers rw ?? ? ? ?
data path ?? ? ? ??? 5
wake up (pm) context ? 1 ?
wake up control register ?? 2
wake up status registers ?? 3
mng unit ??
wake up management registers ?? ? ? ??? 4
phy ?? ? ? ? ? ?
strapping pins ?? ? ?
notes:
1. if d3cold is not supported, the wake-up context is reset (pme_status and pme_en bits).
2. refers to bits in the wake-up control (wuc) register that are not part of the wake-up context (the pme_en and pme_status bits).
3. the wake-up status (wus) registers include the following:
• wus register.
• wake-up packet length.
• wake-up packet memory.
4. the wake-up management (wum) registers include the following:
• wake-up filter control.
• ip address valid.
• ipv4 address table
• ipv6 address table
• flexible filter length table
• flexible filter mask table
5. the following register fields do not follow the previously mentioned general rules:
• packet buffer allocation (pba) - reset on internal power on reset only.
• packet buffer size (pbs) - reset on internal power on reset only.
• led configuration registers.
• the aux power detected bit in the pcie device status register is reset on internal power on reset and pcie power good only.
• fla - reset on internal power on reset only.
6. the nvm is loaded only when the lan function exits d3hot state.
in situations where the device is reset using the software reset ctrl.rst, the tx data lines are forced to all zeros. this causes a substantial number of symbol errors to be detected by the link partner.
4.3 power up
4.3.1 power-up sequence
figure 9 through figure 15 show the 82574's power-up sequencing. figure 9 shows a high-level view of the power sequence, while figure 10 through figure 15 provide a more detailed description of each state.
figure 9. 82574 power up - general flow [flowchart; stages and connectors as labeled: start; power-on-reset; a, b (flash, eeprom); load flash; load eeprom; c; initialize manageability and phy; d; read nvm after perst# de-assertion; e; initialize pcie and phy; bring up pcie link]
figure 10. 82574 initialization - power-on reset [flowchart; legend: stage, comments, duration (ms), note; entries: power ramp up (3.3 v dc, 1.9 v dc, 1.05 v dc); start xosc, stable from power-up, less than 10; internal power-on-reset triggers, from power-up, less than 50; 82574 samples nvmt strapping, determine nvm type, 0; exits to a (flash) or b (eeprom)]
figure 11. 82574 initialization - flash load [flowchart entered at a and exited at c, branching on good signature or bad signature; entries listed as stage; comments; duration (ms); note:]
• read signature at word 0x12; ~0
• load sector 0 to shadow ram; set eec.shadv and clear eec.sel1val; ~2.1; note 1
• read signature at word 2k+0x12; ~0
• load base area (0x00-0x40) from shadow ram; ~0.0032; note 2
• load sector 1 to shadow ram; set eec.shadv and set eec.sel1val; ~2.1; note 1
• 82574 set to default values; set eec.auto_rd; 0
• clear write protection; set flash write status enable and write status; 0.008; note 3
notes:
1. a 4 kb sector is read in a single burst, so the packet overhead is negligible. the time is 4 kb x 8 bits / 15.625 mb/s = 2.1 ms.
2. the shadow ram is read at the rate of one word every ~3 clocks of 62.5 mhz, or ~50 ns per word. the 64 words are read in 3.2 µs.
3. clear write protection is required for an sst* flash only. the required instruction codes are hardwired in the design as defined by the sst 25xxx flash family: code 0x50 for write status enable and code 0x01 for status write. the 82574 writes 0x00 to the status word, which clears all protection. software accesses to the flash are not executed until this step completes.
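the durations quoted in notes 1 and 2 of figure 11 can be re-derived with a few lines of arithmetic. the sketch below only restates the numbers given in the notes; the variable names are illustrative.

```python
# Note 1: one 4 KB sector read in a single burst at 15.625 Mb/s.
SECTOR_BITS = 4 * 1024 * 8
FLASH_BIT_RATE = 15.625e6                      # bits per second
sector_load_ms = SECTOR_BITS / FLASH_BIT_RATE * 1e3

# Note 2: one shadow RAM word every ~3 clocks of 62.5 MHz.
word_time_ns = 3 / 62.5e6 * 1e9                # 48 ns, rounded to ~50 ns in the note
base_area_us = 64 * 50 / 1000                  # 64 words at ~50 ns each

print(f"sector load: {sector_load_ms:.2f} ms")
print(f"word read:   {word_time_ns:.0f} ns")
print(f"base area:   {base_area_us:.1f} us")
```

the base area load therefore takes about 3.2 µs, which matches the ~0.0032 ms entry in the figure.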
figure 12. 82574 initialization - eeprom load [flowchart entered at b and exited at c, branching on good signature or bad signature; entries listed as stage; comments; duration (ms); note:]
• detect address length of 1b or 2b based on signature; ~0
• load base area (0x00-0x40) from eeprom; set eec.auto_rd; ~1.28; note 3
• 82574 set to default values; set eec.auto_rd; 0
note 3: each word is read separately using a 5-byte command (1 byte instruction, 2 byte address, and 2 byte data). total time at 2 mb/s is 64 words x 5 bytes x 8 bits / 2 mb/s = 1.28 ms. the rate is 20 µs per word.
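as a quick check of the eeprom load time in figure 12, the following sketch recomputes the per-word and total read times from the 5-byte command and the 2 mb/s rate. the function name and its defaults are illustrative only.

```python
def eeprom_read_time_s(words: int, bytes_per_command: int = 5,
                       bit_rate: float = 2e6) -> float:
    """Time to read `words` 16-bit words, one 5-byte command per word."""
    return words * bytes_per_command * 8 / bit_rate


per_word_us = eeprom_read_time_s(1) * 1e6      # one word
base_area_ms = eeprom_read_time_s(64) * 1e3    # the 64-word base area
print(f"{per_word_us:.0f} us per word, {base_area_ms:.2f} ms for 64 words")
```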
figure 13. 82574 initialization - phy and manageability [flowchart entered at c and exited at d, with flash and eeprom branches and a decision on whether the extended configuration needs to be loaded; entries listed as stage; comments; duration (ms); note:]
• enable manageability and/or wake up based on nvm configuration; based on mng_mode bits in nvm word 0x0f; ~0
• load extended configuration from shadow ram (flash); clear sw/hw nvm semaphore; ~0.42; note 4
• load extended configuration from eeprom; clear sw/hw nvm semaphore; ~0.8; note 5
• 82574 set to default values (no need to load extended configuration); clear sw/hw nvm semaphore; 0
• enable the phy if needed; phy was inactive up to now; 11
notes:
4. each pcie register write takes ~20 pcie clocks (31.25 mhz) per table entry, or 640 ns per dword. each phy register write takes those 20 clocks plus 64 mdc cycles on the mdio interface (2.5 mhz), or 26.24 µs per dword. therefore, the total is 640 ns x 4 + 26.24 µs x 16 = 422 µs.
5. each pcie register write takes ~20 pcie clocks (31.25 mhz) per table entry, or 640 ns per dword. each phy register write takes those 20 clocks plus 64 mdc cycles on the mdio interface (2.5 mhz), or 26.24 µs per dword. therefore, the bottleneck is the eeprom at 40 µs per dword. the 16+4 entries take 20 dwords x 40 µs = 0.8 ms.
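the extended configuration load times in notes 4 and 5 follow from the clock rates given there. the sketch below recomputes both paths; it assumes, as the notes imply, 4 pcie entries and 16 phy entries, and that one dword costs two 20 µs eeprom word reads. the variable names are illustrative.

```python
PCIE_ENTRIES, PHY_ENTRIES = 4, 16

# Register write costs, in microseconds.
pcie_write_us = 20 / 31.25e6 * 1e6                     # 20 clocks at 31.25 MHz
phy_write_us = pcie_write_us + 64 / 2.5e6 * 1e6        # plus 64 MDC cycles at 2.5 MHz

# Flash case: data comes from the shadow RAM, so register writes dominate.
shadow_path_us = PCIE_ENTRIES * pcie_write_us + PHY_ENTRIES * phy_write_us

# EEPROM case: each dword needs two 20 us word reads, which is slower
# than any register write, so the EEPROM is the bottleneck.
eeprom_dword_us = 2 * 20
eeprom_path_us = (PCIE_ENTRIES + PHY_ENTRIES) * eeprom_dword_us

print(f"pcie write  {pcie_write_us:.2f} us, phy write {phy_write_us:.2f} us")
print(f"shadow path {shadow_path_us:.1f} us, eeprom path {eeprom_path_us} us")
```

the results, roughly 0.42 ms and 0.8 ms, agree with the durations shown in the figure.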
figure 14. 82574 initialization - nvm load after pe_rst_n [flowchart entered at d and exited at e, with flash, eeprom, and no nvm branches; entries listed as stage; comments; duration (ms); note:]
• perst# is de-asserted by the platform; phy is powered down; ~0
• nvmt strapping is sampled; determine nvm type; ~0
• check valid shadow and signature (flash); ~0
• load base area (0x00-0x40) from shadow ram; set eec.auto_rd; ~0.0032; note 2
• detect address length of 1b or 2b based on signature (eeprom); ~0
• load base area (0x00-0x40) from eeprom; set eec.auto_rd; ~1.28; note 3
• 82574 set to default values; set eec.auto_rd; 0
figure 15. 82574 initialization - phy and pcie [flowchart entered at e, with flash and eeprom branches; entries listed as stage; comments; duration (ms); note:]
• load extended configuration from shadow ram (flash); clear sw/hw nvm semaphore; ~0.42; note 4
• load extended configuration from eeprom; clear sw/hw nvm semaphore; ~0.8; note 5
• enable the phy; phy was in power-down during nvm load; 11
• start pcie link training; must start less than 20 ms after perst# de-assertion
• pcie link ready to accept configuration requests; must start less than 100 ms after perst#
4.3.2 timing diagram
figure 16. power-up timing diagram [timing diagram; signals: power; xosc; power-on-reset (internal); pcie reference clock; perst#; nvm load (auto read, ext. conf.); phy state (powered-down, active / down); manageability / wake; pcie link up (l0); d-state (dr, d0u, d0a); timing parameters shown: txog, tppg, tee, tpwrgd-clk, tpvpgl, tpgtrn, tpgcfg, tpgres; events 1 through 13 are described in table 28]
table 28. notes to power-up timing diagram
1. xosc is stable txog after power is stable.
2. internal reset is released after all power supplies are good and tppg after xosc is stable.
3. an nvm read starts on the rising edge of the internal reset or internal power on reset#.
4. after reading the nvm, phy might exit power down mode.
5. apm wake up and/or manageability might be enabled based on nvm contents.
6. the pcie reference clock is valid tpwrgd-clk before the de-assertion of pe_rst_n (according to pcie specification).
7. pe_rst_n is de-asserted tpvpgl after power is stable (according to pcie specification).
8. de-assertion of pe_rst_n causes the nvm to be re-read, asserts phy power-down, and disables wake up.
9. after reading the nvm, phy exits power-down mode.
10. link training starts after tpgtrn from pe_rst_n de-assertion.
11. a first pcie configuration access might arrive after tpgcfg from pe_rst_n de-assertion.
12. a first pci configuration response can be sent after tpgres from pe_rst_n de-assertion.
13. writing a 1b to the memory access enable bit in the pci command register transitions the device from d0u to d0 state.
4.4 global reset (pe_rst_n, pcie in-band reset)
4.4.1 reset sequence
figure 17 and figure 18 show the 82574's sequence following global reset (pe_rst_n de-assertion or pcie in-band reset) and until the device is ready to accept host commands.
figure 17. 82574 global reset - nvm load [flowchart exited at a, with flash, eeprom, and no nvm branches; entries listed as stage; comments; duration (ms); note:]
• reset (pe_rst_# de-assertion or in-band); phy is powered down; ~0
• nvmt strapping is sampled; determine nvm type; ~0
• check valid shadow and signature (flash); ~0
• load base area (0x00-0x40) from shadow ram; set eec.auto_rd; ~0.0032; note 2
• detect address length of 1b or 2b based on signature (eeprom); ~0
• load base area (0x00-0x40) from eeprom; set eec.auto_rd; ~1.28; note 3
• 82574 set to default values; set eec.auto_rd; 0
figure 18. 82574 global reset - phy and pcie [flowchart entered at a, with flash and eeprom branches; entries listed as stage; comments; duration (ms); note:]
• load extended configuration from shadow ram (flash); clear sw/hw nvm semaphore; ~0.42; note 4
• load extended configuration from eeprom; clear sw/hw nvm semaphore; ~0.8; note 5
• enable the phy; phy was in power-down during nvm load; 11
• start pcie link training; must start less than 80 ms after perst# de-assertion
• pcie link ready to accept configuration requests; must start less than 100 ms after perst#
4.4.2 timing diagram
the following timing diagram shows the 82574's behavior through a pe_rst_n reset.
figure 19. global reset timing diagram [timing diagram; signals: pcie reference clock; perst#; nvm load (auto read, ext. conf.); phy state (active, active / down); wake (any mode, apm); pcie link up (l0); d-state (d0a, dr, d0u, d0a); timing parameters shown: tclkpg, tpwrgd-clk, tee, tpgtrn, tpgcfg, tpgres; events 1 through 10 are described in table 29]
table 29. notes to global reset timing diagram
1. the system must assert pe_rst_n before stopping the pcie reference clock. it must also wait tl2clk after link transition to l2/l3 before stopping the reference clock.
2. on assertion of pe_rst_n, the 82574 transitions to dr state and the pcie link transitions to electrical idle. the phy state is defined by the wake and manageability configuration.
3. the system starts the pcie reference clock tpwrgd-clk before de-assertion of pe_rst_n.
4. de-assertion of pe_rst_n causes the nvm to be re-read, asserts phy power-down, and disables wake up.
5. after reading the nvm base area, phy reset is de-asserted. apm wake might be enabled.
6. link training starts after the nvm was fully read (including extended configuration if needed).
7. link training starts after tpgtrn from pe_rst_n de-assertion.
8. a first pcie configuration access might arrive after tpgcfg from pe_rst_n de-assertion.
9. a first pci configuration response can be sent after tpgres from pe_rst_n de-assertion.
10. writing a 1b to the memory access enable bit in the pci command register transitions the device from d0u to d0 state.
4.5 timing parameters
4.5.1 timing requirements
the 82574 requires the following timing for start-up and power state transitions.
table 30. timing requirements
parameter | description | min / max | notes
txog | xosc stable from power stable | 10 ms |
tpwrgd-clk | pcie clock valid to pcie power good | 100 µs / - | according to pcie specification.
tpvpgl | power rails stable to pcie pe_rst_n inactive | 100 ms / - | according to pcie specification.
tpgcfg | external pe_rst_n signal to first configuration cycle | 100 ms | according to pcie specification.
td0mem | device programmed from d3h to d0 state to next device access | 10 ms | according to pci power management specification.
tl2pg | l2 link transition to pe_rst_n assertion | 0 ns | according to pcie specification.
tl2clk | l2 link transition to removal of pcie reference clock | 100 ns | according to pcie specification.
tclkpg | pe_rst_n assertion to removal of pcie reference clock | 0 ns | according to pcie specification.
tpgdl | pe_rst_n assertion time | 100 µs | according to pcie specification.
4.6 software initialization sequence
the following sequence of commands is typically issued to the device by the software device driver in order to initialize the 82574 to normal operation. the major initialization steps are:
1. disable interrupts - see interrupts during initialization.
2. issue global reset and perform general configuration - see global reset and general configuration.
3. set up the phy and the link - see link setup mechanisms and control/status bit summary.
4. initialize all statistical counters - see initialization of statistics.
5. initialize receive - see receive initialization.
6. initialize transmit - see transmit initialization.
7. enable interrupts - see interrupts during initialization.
4.6.1 interrupts during initialization
most drivers disable interrupts during initialization to prevent re-entrancy. interrupts are disabled by writing to the imc register. note that the interrupts also need to be disabled after issuing a global reset, so a typical driver initialization flow is:
1. disable interrupts
2. issue a global reset
3. disable interrupts (again)
4. ...
after the initialization completes, a typical driver enables the desired interrupts by writing to the ims register.
4.6.2 global reset and general configuration
device initialization typically starts with a global reset that puts the device into a known state and enables the software device driver to continue the initialization sequence.
several values in the device control (ctrl) register need to be set at power up or after a device reset for normal operation.
• full duplex should be set per interface negotiation (if done in software), or is set by the hardware if the interface is auto-negotiating. this is reflected in the device status register in the auto-negotiating case. a default value is loaded from the nvm.
• speed is determined via auto-negotiation by the phy, or forced by software if the link is forced. status information for speed is also readable in status.
• ilos should normally be set to 0b.
if using xoff flow control, program the fcah, fcal, and fct registers. if not, they should be written with 0x0.
gcr bit 22 should be set to 1b by software during initialization.
4.6.3 link setup mechanisms and control/status bit summary
4.6.3.1 phy initialization
refer to the phy documentation for the initialization and link setup steps. the device driver uses the mdic register to initialize the phy and set up the link.
4.6.3.2 mac/phy link setup
this section summarizes the various means of establishing proper mac/phy link setups, differences in mac ctrl register settings for each mechanism, and the relevant mac status bits.
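the opening steps of sections 4.6.1 and 4.6.2 can be sketched against a mock register file. this is a hedged illustration, not driver code: the `MockNic` class and `init_prologue` function are hypothetical, registers are addressed by name rather than by offset, the all-ones interrupt mask is an assumption, and the reset is assumed to be issued through ctrl.rst with an assumed bit position (bit 26) that should be checked against the ctrl register description.

```python
ALL_INTERRUPTS = 0xFFFFFFFF   # assumption: mask every cause via IMC
CTRL_RST = 1 << 26            # assumption: position of the CTRL.RST bit
GCR_BIT_22 = 1 << 22          # from the text: GCR bit 22 is set to 1b


class MockNic:
    """Hypothetical stand-in for memory-mapped register access."""

    def __init__(self):
        self.regs = {}
        self.log = []             # ordered record of (register, value) writes

    def read(self, name: str) -> int:
        return self.regs.get(name, 0)

    def write(self, name: str, value: int) -> None:
        self.regs[name] = value
        self.log.append((name, value))


def init_prologue(nic: MockNic, use_xoff: bool = False) -> None:
    nic.write("imc", ALL_INTERRUPTS)                  # 1. disable interrupts
    nic.write("ctrl", nic.read("ctrl") | CTRL_RST)    # 2. issue the reset
    nic.write("imc", ALL_INTERRUPTS)                  # 3. disable interrupts again
    if not use_xoff:
        # Flow control registers are written with 0x0 when XOFF is not used.
        for reg in ("fcal", "fcah", "fct"):
            nic.write(reg, 0x0)
    nic.write("gcr", nic.read("gcr") | GCR_BIT_22)    # set GCR bit 22


nic = MockNic()
init_prologue(nic)
print([name for name, _ in nic.log])
```

a real driver would also wait for the reset to complete before the second imc write; the mock does not model reset timing or the self-clearing behavior of the reset bit.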
the methods are ordered in terms of preference (the first mechanism being the most preferred).
• mac settings automatically based on duplex and speed resolved by phy. (ctrl.frcdplx = 0b, ctrl.frcspd = 0b, ctrl.asde = 0b)
  - ctrl.fd - don't care; duplex setting is established from phy's internal indication to the mac (fdx) after phy has auto-negotiated a successful link-up.
  - ctrl.slu - must be set to 1b by software to enable communications between mac and phy.
  - ctrl.rfce - must be set by software after reading flow control resolution from phy registers.
  - ctrl.tfce - must be set by software after reading flow control resolution from phy registers.
  - ctrl.speed - don't care; speed setting is established from phy's internal indication to the mac (spd_ind) after phy has auto-negotiated a successful link-up.
  - status.fd - reflects the actual duplex setting (fdx) negotiated by the phy and indicated to the mac.
  - status.lu - reflects link indication (link) from the phy qualified with ctrl.slu (set to 1b).
  - status.speed - reflects actual speed setting negotiated by the phy and indicated to the mac (spd_ind).
• mac duplex setting automatically based on resolution of phy, software-forced mac/phy speed. (ctrl.frcdplx = 0b, ctrl.frcspd = 1b, ctrl.asde = don't care)
  - ctrl.fd - don't care; duplex setting is established from phy's internal indication to the mac (fdx) after phy has auto-negotiated a successful link-up.
  - ctrl.slu - must be set to 1b by software to enable communications between the mac and phy.
  - ctrl.rfce - must be set by software after reading flow control resolution from phy registers.
  - ctrl.tfce - must be set by software after reading flow control resolution from the phy registers.
  - ctrl.speed - set by software to desired link speed (must match speed setting of phy).
  - status.fd - reflects the actual duplex setting (fdx) negotiated by the phy and indicated to mac.
  - status.lu - reflects link indication (link) from the phy qualified with ctrl.slu (set to 1b).
  - status.speed - reflects mac forced speed setting written in ctrl.speed.
• mac duplex and speed settings forced by software based on resolution of phy. (ctrl.frcdplx = 1b, ctrl.frcspd = 1b, ctrl.asde = don't care)
  - ctrl.fd - set by software based on reading phy status register after the phy has auto-negotiated a successful link-up.
  - ctrl.slu - must be set to 1b by software to enable communications between the mac and phy.
  - ctrl.rfce - must be set by software after reading flow control resolution from the phy registers.
  - ctrl.tfce - must be set by software after reading flow control resolution from the phy registers.
  - ctrl.speed - set by software based on reading phy status register after the phy has auto-negotiated a successful link-up.
  - status.fd - reflects the mac forced duplex setting written to ctrl.fd.
  - status.lu - reflects link indication (link) from the phy qualified with ctrl.slu (set to 1b).
  - status.speed - reflects mac forced speed setting written in ctrl.speed.
• mac/phy duplex and speed settings both forced by software (fully-forced link setup). (ctrl.frcdplx = 1b, ctrl.frcspd = 1b, ctrl.slu = 1b)
  - ctrl.fd - set by software to desired full-/half-duplex operation (must match duplex setting of the phy).
  - ctrl.slu - must be set to 1b by software to enable communications between the mac and phy. the phy must also be forced/configured to indicate positive link indication (link) to the mac.
  - ctrl.rfce - must be set by software to the desired flow-control operation (must match flow-control settings of the phy).
  - ctrl.tfce - must be set by software to the desired flow-control operation (must match flow-control settings of the phy).
  - ctrl.speed - set by software to desired link speed (must match speed setting of the phy).
  - status.fd - reflects the mac duplex setting written by software to ctrl.fd.
  - status.lu - reflects 1b (positive link indication link from phy qualified with ctrl.slu). note: since both ctrl.slu and the phy link indication link are forced, this bit being set does not guarantee that operation of the link has been truly established.
  - status.speed - reflects mac forced speed setting written in ctrl.speed.
4.6.4 initialization of statistics
statistics registers are hardware-initialized to values as detailed in each particular register's description. the initialization of these registers begins at transition to d0 active power state (when internal registers become accessible, as enabled by setting the memory access enable field of the pcie command register), and is guaranteed to complete within 1 ms of this transition. access to statistics registers prior to this interval might return indeterminate values.
all of the statistical counters are cleared on read and a typical software device driver reads them (thus making them zero) as a part of the initialization sequence.
4.6.5 receive initialization
program the receive address register(s) per the station address.
this can come from the nvm or from any other means. for example, on some systems it comes from the system eeprom rather than from the nvm on a network interface card (nic).

set up the multicast table array (mta) per software. this generally means zeroing all entries initially and adding in entries as requested.

program the interrupt mask register to pass any interrupt that the software device driver cares about. suggested bits include rxt, rxo, rxdmt and lsc. there is no reason to enable the transmit interrupts.

program rctl with appropriate values. if initializing it at this stage, it is best to leave the receive logic disabled (en = 0b) until the receive descriptor ring has been initialized. if vlans are not used, software should clear the vfe bit; there is then no need to initialize the vfta array. select the receive descriptor type. note that if using the header split rx descriptors, tail and head registers should be incremented by two per descriptor.
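the receive initialization steps above can be sketched as a short routine. this is a minimal illustration under stated assumptions, not the driver's actual code: the register offsets and bit positions (ral0/rah0, mta, ims, rctl.en, rctl.vfe, rah.av) are assumptions based on the layout commonly used by e1000-family drivers and should be checked against the register descriptions, and the `regs` dictionary is a hypothetical stand-in for memory-mapped register access.

```python
# register offsets and bit positions are assumptions (e1000-family layout);
# verify them against the register description section before use.
IMS, RCTL = 0x00D0, 0x0100
MTA_BASE, MTA_ENTRIES = 0x5200, 128
RAL0, RAH0 = 0x5400, 0x5404
RAH_AV = 1 << 31                      # "address valid" flag in rah
IMS_LSC, IMS_RXDMT0, IMS_RXO, IMS_RXT0 = 1 << 2, 1 << 4, 1 << 6, 1 << 7
RCTL_EN, RCTL_VFE = 1 << 1, 1 << 18


def mac_to_ral_rah(mac):
    """split a 6-byte station address into ral/rah register values."""
    if len(mac) != 6:
        raise ValueError("station address must be 6 bytes")
    ral = int.from_bytes(mac[0:4], "little")
    rah = int.from_bytes(mac[4:6], "little") | RAH_AV
    return ral, rah


def receive_init(regs, mac, use_vlan=False):
    """apply the receive initialization steps to regs (dict: offset -> value)."""
    regs[RAL0], regs[RAH0] = mac_to_ral_rah(mac)
    for i in range(MTA_ENTRIES):          # zero the multicast table array
        regs[MTA_BASE + 4 * i] = 0
    # pass only the suggested interrupt causes; no transmit interrupts
    regs[IMS] = IMS_RXT0 | IMS_RXO | IMS_RXDMT0 | IMS_LSC
    rctl = regs.get(RCTL, 0) & ~RCTL_EN   # keep the receiver disabled for now
    if not use_vlan:
        rctl &= ~RCTL_VFE                 # vlans unused: clear vfe, skip vfta
    regs[RCTL] = rctl
    return regs
```

the receiver enable bit is deliberately left clear here; it would be set only after the descriptor ring has been programmed.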
4.6.5.1 initialize the receive control register

properly receiving packets simply requires that the receiver is enabled. this should be done only after all other setup is accomplished. if software uses the receive descriptor minimum threshold interrupt, that value should be set.

the following should be done once per receive queue:
• allocate a region of memory for the receive descriptor list.
• receive buffers of appropriate size should be allocated and pointers to these buffers should be stored in the descriptor ring.
• program the descriptor base address with the address of the region.
• set the length register to the size of the descriptor ring.
• if needed, program the head and tail registers. note: the head and tail pointers are initialized (by hardware) to zero after a power-on or a software-initiated device reset.
• the tail pointer should be set to point one descriptor beyond the end.

4.6.6 transmit initialization

program the txdctl register with the desired tx descriptor write-back policy. suggested values are:
• gran = 1b (descriptors)
• wthresh = 1b
• all other fields 0b.

program the tctl register. suggested configuration:
• ct = 0x0f (16d collision)
• cold: hdx = 511 (0x1ff); fdx = 63 (0x03f)
• psp = 1b
• en = 1b
• all other fields 0b

the following should be done once per transmit queue:
• allocate a region of memory for the transmit descriptor list.
• program the descriptor base address with the address of the region.
• set the length register to the size of the descriptor ring.
• if needed, program the head and tail registers. note: the head and tail pointers are initialized (by hardware) to zero after a power-on or a software-initiated device reset.
program the tipg register with the following (decimal) values to get the minimum legal ipg:
• ipgt = 8
• ipgr1 = 2
• ipgr2 = 10
note: ipgr1 and ipgr2 are not needed in full-duplex, but it is easier to always program them to the values listed.

initialize the transmit descriptor registers (tdbal, tdbah, tdl, tdh, and tdt).
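the suggested txdctl, tctl and tipg settings can be turned into register values as sketched below. the field values come from the text above; the bit positions of each field (for example, ct at bits 11:4 or ipgr2 at bits 29:20) are assumptions based on the layout commonly used by e1000-family drivers and should be confirmed against the register descriptions. the helper names are hypothetical.

```python
# field positions are assumptions (e1000-family layout); the field values
# are the ones suggested in the transmit initialization text.
TCTL_EN, TCTL_PSP = 1 << 1, 1 << 3


def _field(value, width, shift):
    """place value in a width-bit field starting at bit shift."""
    if not 0 <= value < (1 << width):
        raise ValueError("value %d does not fit in %d bits" % (value, width))
    return value << shift


def pack_txdctl(wthresh=1, gran=1):
    # wthresh assumed at bits 21:16, gran assumed at bit 24
    return _field(wthresh, 6, 16) | _field(gran, 1, 24)


def pack_tctl(full_duplex=True, ct=0x0F):
    # collision distance: 63 for full duplex, 511 for half duplex
    cold = 63 if full_duplex else 511
    return TCTL_EN | TCTL_PSP | _field(ct, 8, 4) | _field(cold, 10, 12)


def pack_tipg(ipgt=8, ipgr1=2, ipgr2=10):
    # three 10-bit fields assumed at bits 9:0, 19:10 and 29:20
    return _field(ipgt, 10, 0) | _field(ipgr1, 10, 10) | _field(ipgr2, 10, 20)


if __name__ == "__main__":
    print(hex(pack_txdctl()), hex(pack_tctl()), hex(pack_tipg()))
```

keeping the field packing in one helper makes an out-of-range value fail loudly instead of silently corrupting a neighboring field.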
5.0 power management and delivery

the 82574 supports the advanced configuration and power interface (acpi 2.0) specification as well as advanced power management (apm). this section describes how power management is implemented in the 82574.

implementation requirements were obtained from the following documents:
• pci bus power management interface specification, rev 1.1
• pci express base specification, rev 1.1
• acpi specification, rev 2.0
• pci express card electromechanical specification, rev 1.1

5.1 assumptions

the following assumptions apply to the implementation of power management for the 82574.
• the software device driver sets up the filters prior to the system transition of the 82574 to a d3 state.
• prior to transition from d0 to the d3 state, the operating system ensures that the software device driver has been disabled. see section 5.4.4.2.3 for the 82574 behavior on d3 entry.
• no wake up capability, except apm wake up if enabled in the nvm, is required after the system puts the 82574 in d3 state and then returns the 82574 to d0.
• if the apmpme bit in the wake up control (wuc) register is 1b, it is permissible to assert pe_wake_n even when pme_en is 0b.

5.2 power consumption

table 85 and table 86 list power consumption in various modes (see section 12.5). the following sections describe the requirements in specific power states.
5.3 power delivery

the 82574 operates from the following power rails:
• a 3.3 v dc power rail for internal power regulation and for periphery. the 3.3 v dc should be supplied by an external power source.
• a 1.9 v dc power rail.
• a 1.05 v dc power rail.

5.3.1 the 1.9 v dc rail

the 1.9 v dc rail is used for core and i/o functions. it also feeds internal regulators to a lower 1.05 v dc core voltage. the 1.9 v dc rail can be generated in one of two ways:
• an external power supply not dependent on support from the 82574. for example, the platform designer might choose to route a platform-available 1.9 v dc supply to the 82574.
• internal voltage regulator solution, where the control logic for the power transistor is embedded in the 82574, while the power transistor is placed externally. control is done using the ctrl18 pin.

5.3.2 the 1.05 v dc rail

the 1.05 v dc rail is used for core functions and can be generated in one of the following ways:
• an external power supply not dependent on support from the 82574.
• internal voltage regulator solution, where the control logic for the power transistor is embedded in the 82574, while the power transistor is placed externally. control is done using the ctrl10 pin.
• a complete internal voltage regulator solution. the internal voltage regulator can be disabled by the dis_reg10 pin.

5.4 power management

5.4.1 82574 power states

the 82574 supports d0 and d3 power states defined in the pci power management and pcie specifications. d0 is divided into two sub-states: d0u (d0 un-initialized), and d0a (d0 active). in addition, the 82574 supports a dr state that is entered when pe_rst_n is asserted (including the d3cold state). figure 20 shows the power states and transitions between them.
figure 20. power management state diagram (states: dr, d0u, d0a, d3; transition labels: internal power on reset assertion; pe_rst_n de-assertion and eeprom read done; hot (in-band) reset; enable master or slave access; write 11b to power state; write 00b to power state; pe_rst_n assertion)

5.4.2 auxiliary power usage

if advd3wuc = 1b, the 82574 uses the aux_pwr indication that auxiliary power is available to the controller, and therefore advertises d3cold wake up support. the amount of power required for the function (which includes the entire nic) is advertised in the power management data register, which is loaded from the nvm.

if d3cold is supported, the pme_en and pme_status bits of the power management control/status register (pmcsr), as well as their shadow bits in the wake up control (wuc) register, are reset only by the power up reset (detection of power rising).

the only effect of setting aux_pwr to 1b is advertising d3cold wake up support and changing the reset function of pme_en and pme_status. aux_pwr is a strapping option in the 82574.

the 82574 tracks the pme_en bit of the power management control / status register (pmcsr) and the auxiliary (aux) power pm enable bit of the pcie device control register to determine the power it might consume (and therefore its power state) in the d3cold state (internal dr state).
the aux power pm enable bit in the pcie device control register determines if the 82574 complies with the auxiliary power regime defined in the pcie specification. if set, the 82574 might consume higher power for any purpose (for example, even if pme_en is not set).

if the aux power pm enable bit of the pcie device control register is cleared, higher power consumption is determined by the pci-pm legacy pme_en bit in the power management control / status register (pmcsr).

note: in the current implementation, the aux power pm enable bit is hardwired to 0b.

5.4.3 power limits by certain form factors

table 31 lists the power limitations introduced by different form factors.

table 31. power limits by form factor

form factor | lom | pcie nic (x1 connector)
main | 3 a @ 3.3 v dc | 3 a @ 3.3 v dc
auxiliary (aux enabled) | 375 ma @ 3.3 v dc | 375 ma @ 3.3 v dc
auxiliary (aux disabled) | 20 ma @ 3.3 v dc

1. this auxiliary current limit only applies when the primary 3.3 v dc voltage source is not available (such as when the nic is in a low power d3 state).
2. the 82574 exceeds the allowed power consumption in gbe speed. it therefore cannot run from aux power, restricting the 82574 speed in dr state.

the 82574 therefore implements two nvm bits to disable gbe operation in certain cases:
1. the disable 1000 nvm bit disables 1000 mb/s operation under all conditions.
2. the disable 1000 in non-d0a csr bit disables 1000 mb/s operation in non-d0a states. if disable 1000 in non-d0a is set, and the 82574 is at gbe speed on entry to a non-d0a state, then the device removes advertisement for 1000 mb/s and auto-negotiates. the disable 1000 in non-d0a bit is loaded from the nvm.

note: the 82574 restarts link auto-negotiation each time it transitions from a state where gbe speed is enabled to a state where gbe speed is disabled, or vice versa. for example, if disable 1000 in non-d0a is set but disable 1000 is clear, the 82574 restarts link auto-negotiation on transition from d0 state to d3 or dr states.

5.4.4 power states

5.4.4.1 d0 uninitialized state

the d0u state is a low-power state used after pe_rst_n is de-asserted following a power up (cold or warm), on hot reset (in-band reset through a pcie physical layer message), or on d3 exit.
when entering the d0u state, the 82574 disables all wake ups and asserts a reset to the phy while the nvm is being read. if the apm mode bit in the nvm's initialization control word 2 is set, then apm wake up is enabled.

5.4.4.1.1 entry into d0u state

d0u is reached from either the dr state (on assertion of internal pwrgd) or the d3hot state (by configuration software writing a value of 00b to the power state field of the pci-pm registers).

asserting internal pwrgd means that the entire state of the device is cleared, other than sticky bits. the state is loaded from the nvm, followed by establishment of the pcie link. once this is done, configuration software can access the device.

on a transition from the d3 to d0u state, the 82574's pci configuration space is not reset. per the pci power management specification (revision 1.1, section 5.4), software "will need to perform a full re-initialization of the function including its pci configuration space."

5.4.4.2 d0active state

once memory space is enabled, all internal clocks are activated and the 82574 enters an active state. it can transmit and receive packets if properly configured by the software device driver. the phy is enabled or re-enabled by the software device driver to operate / auto-negotiate to full-line speed/power if not already operating at full capability. any apm wakeup previously active remains active. the software device driver can deactivate apm wakeup by writing to the wuc register, or activate other wake-up filters by writing to the wake up filter control (wufc) register.

note: fields that are auto-loaded from the nvm, like wuc.apme, should be configured through an nvm setting, because a d3 to d0 power state transition causes the nvm auto-read to reload those bits from the nvm.

5.4.4.2.1 entry to d0a state

d0a is entered from the d0u state by writing a 1b to the memory access enable or the i/o access enable bit in the pci command register.
the dma, mac, and phy are enabled. manageability is also enabled if configured from the nvm.

5.4.4.2.2 d3 state (=pci-pm d3hot)

when the system writes a 11b to the power state field in the pmcsr, the 82574 transitions to d3. any wake-up filter settings that were enabled before entering this reset state are maintained. upon transition to d3 state, the 82574 clears the memory access enable and i/o access enable bits of the pci command register, which disables memory access decode. in d3, the 82574 only responds to pci configuration accesses and does not generate master cycles.

a d3 state is followed by either a d0u state (in preparation for a d0a state) or by a transition to dr state (pci-pm d3cold state). to transition back to d0u, the system writes a 00b to the power state field of the pmcsr. transition to dr state is through pe_rst_n assertion.
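the state transitions described so far (dr, d0u, d0a and d3) can be summarized as a small transition table. the sketch below is an illustrative model only: the enum and function names are hypothetical, only the transitions stated in the text are included, and treating hot reset as a d0a to d0u transition is an assumption drawn from the d0u description.

```python
from enum import Enum


class State(Enum):
    DR = "dr"
    D0U = "d0u"
    D0A = "d0a"
    D3 = "d3"


class Event(Enum):
    PE_RST_N_DEASSERT_NVM_READ_DONE = 1   # leaves dr once the nvm is read
    ENABLE_MEM_OR_IO_ACCESS = 2           # pci command register write
    WRITE_POWER_STATE_11B = 3             # pmcsr power state field
    WRITE_POWER_STATE_00B = 4
    PE_RST_N_ASSERT = 5
    HOT_RESET = 6                         # in-band pcie reset


TRANSITIONS = {
    (State.DR, Event.PE_RST_N_DEASSERT_NVM_READ_DONE): State.D0U,
    (State.D0U, Event.ENABLE_MEM_OR_IO_ACCESS): State.D0A,
    (State.D0A, Event.WRITE_POWER_STATE_11B): State.D3,
    (State.D3, Event.WRITE_POWER_STATE_00B): State.D0U,
    (State.D0U, Event.PE_RST_N_ASSERT): State.DR,
    (State.D0A, Event.PE_RST_N_ASSERT): State.DR,
    (State.D3, Event.PE_RST_N_ASSERT): State.DR,
    (State.D0A, Event.HOT_RESET): State.D0U,   # assumption, see lead-in
}


def step(state, event):
    """return the next power state, or raise if the text defines no such move."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError("no transition from %s on %s" % (state.name, event.name))


if __name__ == "__main__":
    s = State.DR
    for e in (Event.PE_RST_N_DEASSERT_NVM_READ_DONE,
              Event.ENABLE_MEM_OR_IO_ACCESS,
              Event.WRITE_POWER_STATE_11B,
              Event.WRITE_POWER_STATE_00B):
        s = step(s, e)
        print(e.name, "->", s.value)
```

note that d3 never leads directly to d0a in this model: leaving d3 always passes through d0u, which matches the requirement that software re-enable memory or i/o access after a d3 exit.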
5.4.4.2.3 entry to d3 state

transition to the d3 state is through a configuration write to the power state field of the pci-pm registers.

prior to transition from d0 to the d3 state, the software device driver disables scheduling of further tasks to the 82574, as follows:
• it masks all interrupts.
• it does not write to the transmit descriptor tail (tdt) register.
• it does not write to the receive descriptor tail (rdt) register.
• it operates the master disable algorithm as defined in section 3.1.3.10.

if wake-up capability is needed, the software device driver should set up the appropriate wake-up registers and the system should write a 1b to the pme_en bit in the pmcsr or to the aux power pm enable bit of the pcie device control register prior to the transition to d3.

as a response to being programmed into the d3 state, the 82574 brings its pcie link into the l1 link state. as part of the transition into l1 state, the 82574 suspends scheduling of new transaction layer packets (tlps) and waits for the completion of all previous tlps it has sent. the 82574 clears the memory access enable and i/o access enable bits of the pci command register, which disables memory access decode. any receive packets that have not been transferred into system memory are kept in the device (and discarded later on d3 exit). any transmit packets that were not sent can still be transmitted (assuming the ethernet link is up).

to reduce power consumption, if any of asf manageability, apm wake, and pci-pm pme is enabled, the phy auto-negotiates to a lower link speed on d3 entry (see section 5.4.4.2.3).
5.4.4.3 dr state

transition to dr state is initiated on three occasions:
• at system power up - dr state begins with the assertion of the internal power detection circuit (internal power on reset) and ends with the assertion of the internal pwrgd signal (indicating that the system de-asserted its pcie pe_rst_n signal).
• at transition from a d0a state - during operation, the system might assert pcie pe_rst_n at any time. in an acpi system, a system transition to the g2/s5 state causes a transition from d0a to dr state.
• at transition from a d3 state - the system transitions the device into the dr state by asserting pcie pe_rst_n.

the 82574 meets the restrictions on using auxiliary power, defined in the pci-pm specification:
1. if wake is enabled (either apm wake, acpi wake, or manageability), then the 82574 might consume up to 375 ma @ 3.3 v dc.
2. if wake is disabled, then the 82574 might consume up to 20 ma @ 3.3 v dc.
the restrictions apply to all cases of dr state (power up, d3 entry, dr entry from d0).

note: when the wake configuration is unknown (for example, during power up before an nvm read), the 82574 must meet the 20 ma limit.

the system might maintain pe_rst_n asserted for an arbitrary time. the de-assertion (rising edge) of pe_rst_n causes a transition to d0u state.

any wake-up filter settings that were enabled before entering this reset state are maintained.

5.4.4.3.1 entry to dr state

dr entry on platform power up begins by asserting the internal power detection circuit (internal power on reset). the nvm is read and determines device configuration. if the apm enable bit in the nvm's initialization control word 2 is set, then apm wake up is enabled. the phy and mac states are determined by the state of manageability and apm wake. to reduce power consumption, if manageability or apm wake is enabled, the phy auto-negotiates to a lower link speed on dr entry (see section 5.4.4.3.1).
the pcie link is not enabled in dr state following system power up (since perst# is asserted).

entry to dr state from d0a state is by asserting the pe_rst_n signal. an acpi transition to the g2/s5 state is reflected in a device transition from d0a to dr state. the transition might be orderly (for example, the designer selected the shut down option), in which case the software device driver might have a chance to intervene. or, it might be an emergency transition (such as a power button override), in which case the software device driver is not notified.

to reduce power consumption, if any of manageability, apm wake or pci-pm pme is enabled, the phy auto-negotiates to a lower link speed on d0a to dr transition (see section 5.4.4.3.1).

transition from d3 state to dr state is done by asserting the pe_rst_n signal. prior to that, the system initiates a transition of the pcie link from the l1 state to either the l2 or l3 state. the link enters l2 state if pci-pm pme is enabled.
5.4.4.4 device disable

for a lom design, it might be desirable for the system to provide bios-setup capability for selectively enabling or disabling lom devices. this might allow the designers more control over system resource-management, avoid conflicts with add-in nic solutions, etc. the 82574 provides support for selectively enabling or disabling it.
• device disable - the device is in a global power down state.

device disable is initiated by asserting the asynchronous dev_off_n pin. the dev_off_n pin has an internal pull-up resistor, so that it can be left not connected to enable device operation.

while in device disable mode, the pcie link is in l3 state. the phy is in power-down mode. all internal clocks are gated. output buffers are tri-stated.

asserting or de-asserting pcie pe_rst_n does not have any effect while the device is in device disable mode (that is, the device stays in the respective mode as long as dev_off_n is asserted). however, the device might momentarily exit the device disable mode from the time pcie pe_rst_n is de-asserted again and until the nvm is read.

note to system designers: the dev_off_n pin should maintain its state during system reset and system sleep states. it should also ensure the proper default value on system power up. for example, a system designer could use a gpio pin that defaults to 1b (enable) and is on system suspend power (that is, it maintains state in s0-s5 acpi states).

5.4.4.5 link-disconnect

in any of d0u, d0a, d3, or dr states, the 82574 enters a link-disconnect state if it detects a link-disconnect condition on the ethernet link. note that the link-disconnect state is invisible to software (other than the link energy detect bit state). in particular, while in d0 state, software might be able to access any of the device registers as in a link-connect state.

during link disconnect mode, the ccm pll might be shut down. see section 5.4.4.5.
5.4.5 timing of power-state transitions

the following sections give detailed timing for the state transitions. in the diagrams the dotted connecting lines represent the 82574 requirements, while the solid connecting lines represent the 82574 guarantees.

the timing diagrams are not to scale. the clock edges are shown to indicate running clocks only; they are not used to indicate the actual number of cycles for any operation.

5.4.5.1 transition from d0a to d3 and back without pe_rst_n

figure 21 shows the 82574's reaction to a d3 transition.
figure 21. d3hot transition timing diagram (signals shown: pcie reference clock, pcie pwrgd, phy reset, pcie link, reading eeprom, dstate, wake up enabled, memory access enable, phy power state; timing parameters: tee, td0mem)

table 32. notes to d3hot timing diagram

note | description
1 | writing 11b to the power state field of the pmcsr transitions the 82574 to d3.
2 | the system keeps the 82574 in d3 state for an arbitrary amount of time.
3 | to exit d3 state the system writes 00b to the power state field of the pmcsr.
4 | apm wake up or smbus mode can be enabled based on what is read in the nvm.
5 | after reading the nvm, reset to the phy is de-asserted. the phy operates at reduced-speed if apm wake up or smbus is enabled, else powered-down.
6 | the system can delay an arbitrary time before enabling memory access.
7 | writing a 1b to the memory access enable bit or to the i/o access enable bit in the pci command register transitions the 82574 from d0u to d0 state and returns the phy to full-power/speed operation.

5.4.5.2 transition from d0a to d3 and back with pe_rst_n

figure 22 shows the 82574's reaction to a d3 transition.
figure 22. d3cold transition timing diagram (signals shown: pcie reference clock, pcie pwrgd, internal pcie clock (2.5 ghz), internal pwrgd (pll), reading eeprom, reset to phy (active low), pcie link, dstate, wake up enabled, phy power state; timing parameters: tee, tppg-clkint, tpgtrn, tpgres, tpgcfg, tclkpr, tpgdl, tl2clk, tclkpg, tpwrgd-clk, tl2pg)

table 33. notes to d3cold timing diagram

note | description
1 | writing 11b to the power state field of the pmcsr transitions the 82574 to d3. pcie link transitions to l1 state.
2 | the system can delay an arbitrary amount of time between setting d3 mode and transitioning the link to an l2 or l3 state.
3 | following link transition, pe_rst_n is asserted.
4 | the system must assert pe_rst_n before stopping the pcie reference clock. it must also wait tl2clk after link transition to l2/l3 before stopping the reference clock.
5 | on assertion of pe_rst_n, the 82574 transitions to dr state.
6 | the system starts the pcie reference clock tpwrgd-clk before de-asserting pe_rst_n.
7 | the internal pcie clock is valid and stable tppg-clkint from pe_rst_n de-assertion.
8 | the pcie internal pwrgd signal is asserted tclkpr after the external pe_rst_n signal.
9 | asserting internal pcie pwrgd causes the nvm to be re-read, asserts phy reset, and disables wake up.
10 | apm wake-up mode can be enabled based on what is read from the nvm.
11 | after reading the nvm, phy reset is de-asserted.
12 | link training starts after tpgtrn from pe_rst_n de-assertion.
13 | a first pcie configuration access might arrive after tpgcfg from pe_rst_n de-assertion.
14 | a first pci configuration response can be sent after tpgres from pe_rst_n de-assertion.
15 | writing a 1b to the memory access enable bit in the pci command register transitions the device from the d0u to d0 state.
5.5 wake up

the 82574 supports two types of wake-up mechanisms:
• advanced power management (apm) wake up
• pcie power management wake up

the pcie power management wake up uses the pe_wake_n pin to wake the system up. the advanced power management wake up can be configured to use the pe_wake_n pin as well.

5.5.1 advanced power management wake up

advanced power management wake up, or apm wake up, was previously known as wake on lan. it is a feature that has existed in the 10/100 mb/s nics for several generations. the basic premise is to receive a broadcast or unicast packet with an explicit data pattern, and then to assert a signal to wake up the system. in the earlier generations, this was accomplished by using a special signal that ran across a cable to a defined connector on the motherboard. the nic would assert the signal for approximately 50 ms to signal a wake up. the 82574 uses (if configured to) an in-band pm_pme message for this.

at power up, the 82574 reads the apm enable bits from the nvm initialization control word 2 into the apm enable (apme) bits of the wuc. these bits control enabling of apm wake up.

when apm wake up is enabled, the 82574 checks all incoming packets for magic packets. see section 5.5.3.1.4 for a definition of magic packets.

once the 82574 receives a matching wake-up packet, it:
• if the assert pme on apm wakeup (apmpme) bit is set in the wuc:
  - sets the pme_status bit in the pmcsr and issues a pm_pme message (in some cases, this might require asserting the pe_wake_n signal first to resume power and clock to the pcie interface).
• stores the first 128 bytes of the packet in the wupm.
• sets the relevant received bit in the wus.

the 82574 maintains the first wake-up packet received in the wupm until the software device driver writes a 1b to the magic packet received (mag) bit in the wus.

note: the wupm latches on the first wake-up packet.
subsequent wake-up packets are not saved until the programmer writes 1b to the relevant bit in the wus. the best course of action is to write a 1b to all of the wuc's bits, for example, set wuc = 0xffffffff.

note: full power-on reset also clears the wuc.

apm wake up is supported in all power states and only disabled if a subsequent nvm read results in the apm wake up bit being cleared or software explicitly writes a 0b to the apm wake up (apm) bit of the wuc register.
5.5.2 pcie power management wake up

the 82574 supports pcie power management based wake ups. it can generate system wake-up events from three sources:
• reception of a magic packet*.
• reception of a network wake-up packet.
• detection of a link change of state.

activating pcie power management wake up requires the following steps:
• the software device driver programs the wufc to indicate the packets it needs to use to indicate wake up and supplies the necessary data to the ipv4/v6 address table (ip4at, ip6at) and the flexible filter mask table (ffmt), flexible filter length table (fflt), and the flexible filter value table (ffvt). it can also set the link status change wake up enable (lnkc) bit in the wufc to cause a wake up when the link changes state.
• the operating system (at configuration time) writes a 1b to the pme_en bit of the pmcsr (bit 8).

normally, after enabling wake up, the operating system writes a 11b to the lower two bits of the pmcsr to put the 82574 into a low-power mode.

once wake up is enabled, the 82574 monitors incoming packets, first filtering them according to its standard address filtering method, then filtering them with all of the enabled wake-up filters. if a packet passes both the standard address filtering and at least one of the enabled wake-up filters, the 82574:
• sets the pme_status bit in the pmcsr.
• if the pme_en bit in the pmcsr is set, asserts pe_wake_n.
• stores the first 128 bytes of the packet in the wupm.
• sets one or more of the received bits in the wus. (the 82574 sets more than one bit if a packet matches more than one filter.)

if enabled, a link state change wake up causes similar results, setting pme_status, asserting pe_wake_n and setting the link status changed (lnkc) bit in the wus when the link goes up or down.

pe_wake_n remains asserted until the operating system either writes a 1b to the pme_status bit of the pmcsr or writes a 0b to the pme_en bit.
after receiving a wake-up packet, the 82574 ignores any subsequent wake-up packets until the software device driver clears all of the received bits in the wus. it also ignores link change events until the software device driver clears the link status changed (lnkc) bit in the wus.

5.5.3 wake-up packets

the 82574 supports various wake-up packets using two types of filters:
• pre-defined filters
• flexible filters

each of these filters is enabled if the corresponding bit in the wufc is set to 1b.
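the packet handling described in section 5.5.2 (set pme_status, assert pe_wake_n if pme_en is set, latch the first 128 bytes, set the received bits, then ignore further wake-up packets until the received bits are cleared) can be modeled in a few lines. this is a behavioral sketch only: the class and method names are hypothetical, the filter bit positions are assumptions based on the layout commonly used by e1000-family drivers, and the address and wake-up filters themselves are reduced to inputs supplied by the caller.

```python
# filter bit positions are assumptions; only their distinctness matters here.
WUFC_MAG, WUFC_EX, WUFC_ARP = 1 << 1, 1 << 2, 1 << 5


class WakeLogic:
    """behavioral model of pcie power management wake-up handling."""

    def __init__(self, pme_en=False):
        self.pme_en = pme_en
        self.pme_status = False
        self.pe_wake_asserted = False
        self.wus = 0          # received bits
        self.wupm = b""       # latched first 128 bytes of the wake-up packet

    def on_packet(self, packet, passes_address_filter, matched_filters, wufc):
        """return True if the packet was accepted as a wake-up packet."""
        hits = matched_filters & wufc          # only enabled filters count
        if not passes_address_filter or hits == 0:
            return False
        if self.wus != 0:                      # earlier wake-up not yet cleared
            return False
        self.pme_status = True
        if self.pme_en:
            self.pe_wake_asserted = True
        self.wupm = bytes(packet[:128])
        self.wus |= hits
        return True

    def clear_wus(self, bits):
        """driver writes 1b to the given received bits to clear them."""
        self.wus &= ~bits

    def clear_pme_status(self):
        """operating system writes 1b to pme_status; pe_wake_n is released."""
        self.pme_status = False
        self.pe_wake_asserted = False
```

for example, a second matching packet is dropped by `on_packet` until `clear_wus` has removed every set bit, which mirrors the latch-on-first-packet behavior of the wupm.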
5.5.3.1 pre-defined filters

the following packets are supported by the 82574's pre-defined filters:
• directed packet (including exact, multicast indexed, and broadcast)
• magic packet*
• arp/ipv4 request packet
• directed ipv4 packet
• directed ipv6 packet

each of these filters is enabled if the corresponding bit in the wufc is set to 1b.

the explanation of each filter includes a table showing which bytes at which offsets are compared to determine if the packet passes the filter. both vlan frames and llc/snap can increase the given offsets if they are present.

5.5.3.1.1 directed exact packet

the 82574 generates a wake-up event upon receipt of any packet whose destination address matches one of the 16 valid programmed receive addresses if the directed exact wake up enable bit is set in the wake up filter control register (wufc.ex).

offset | # of bytes | field | value | action | comment
0 | 6 | destination address | | compare | match any pre-programmed address

5.5.3.1.2 directed multicast packet

for multicast packets, the upper bits of the incoming packet's destination address index a bit vector, the multicast table array, that indicates whether to accept the packet. if the directed multicast wake up enable bit is set in the wake up filter control register (wufc.mc) and the indexed bit in the vector is one, then the 82574 generates a wake-up event. the exact bits used in the comparison are programmed by software in the multicast offset field of the receive control register (rctl.mo).

offset | # of bytes | field | value | action | comment
0 | 6 | destination address | | compare | see above paragraph.

5.5.3.1.3 broadcast

if the broadcast wake up enable bit in the wake up filter control register (wufc.bc) is set, the 82574 generates a wake-up event when it receives a broadcast packet.

offset | # of bytes | field | value | action | comment
0 | 6 | destination address | 0xff*6 | compare |
5.5.3.1.4 magic packet*

once the 82574 has been put into the magic packet* mode, it scans all incoming frames addressed to the node for a specific data sequence, which indicates to the controller that this is a magic packet* frame. a magic packet* frame must also meet the basic requirements for the lan technology chosen, such as source address, destination address (which may be the receiving station's ieee address or a multicast address which includes the broadcast address), and crc. the specific data sequence consists of 16 duplications of the ieee address of this node, with no breaks or interruptions. this sequence can be located anywhere within the packet, but must be preceded by a synchronization stream. the synchronization stream enables the scanning state machine to be much simpler. the synchronization stream is defined as 6 bytes of 0xff. the 82574 also accepts a broadcast frame, as long as the 16 duplications of the ieee address match the address of the machine to be awakened.

the 82574 expects the destination address to either:
1. be the broadcast address (0xff.ff.ff.ff.ff.ff)
2. match the value in receive address register 0 (rah0, ral0). this is initially loaded from the nvm but might be changed by the software device driver.
3. match any other address filtering enabled by the software device driver.

the 82574 searches for the contents of receive address register 0 (rah0, ral0) as the embedded ieee address. it considers any non-0xff byte after a series of at least 6 0xffs to be the start of the ieee address for comparison purposes. (that is, it catches the case of 7 0xffs followed by the ieee address.) as soon as one of the first 96 bytes after a string of 0xffs doesn't match, it continues to search for another set of at least 6 0xffs followed by the 16 copies of the ieee address later in the packet.

note: this definition precludes the first byte of the destination address from being 0xff.
a magic packet's* destination address must match the address filtering enabled in the configuration registers with the exception that broadcast packets are considered to match even if the broadcast accept bit of the receive control register (rctl.bam) is 0b. if apm wakeup is enabled in the nvm, the 82574 starts up with the receive address register 0 (rah0, ral0) loaded from the nvm. this enables the 82574 to accept packets with the matching ieee ad dress before the software device driver comes up. offset # of bytes field value action comment 0 6 destination address compare mac header ? processed by main address filter 6 6 source address skip 12 8 possible llc/snap header skip 12 4 possible vlan tag skip 12 4 type skip any 6 synchronizing stream 0xff*6+ compare any+6 96 16 copies of node address a*16 compare compared to receive address register 0 (rah0, ral0)
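
The scan described in section 5.5.3.1.4 can be sketched in software. The following is a minimal Python illustration, not the hardware implementation. The names `would_wake` and `MAGIC_SYNC` are hypothetical, the destination check covers only the broadcast address and the address held in Receive Address Register 0 (any other enabled address filter is left out), and VLAN or LLC/SNAP handling is ignored.

```python
BROADCAST = b"\xff" * 6
MAGIC_SYNC = b"\xff" * 6  # synchronization stream: 6 bytes of 0xFF


def would_wake(frame: bytes, node_addr: bytes) -> bool:
    """Return True if `frame` looks like a Magic Packet for `node_addr`.

    `node_addr` stands in for the 6-byte address in RAH0/RAL0.
    Assumption: its first byte is not 0xFF, as the note in the text requires.
    """
    if len(node_addr) != 6:
        raise ValueError("node address must be 6 bytes")

    # Destination address: broadcast or the node's own address.
    dest = frame[:6]
    if dest != BROADCAST and dest != node_addr:
        return False

    # Any run of more than six 0xFF bytes still ends in six 0xFF bytes,
    # so a plain substring search also covers the "7 0xFFs" case.
    pattern = MAGIC_SYNC + node_addr * 16
    return pattern in frame
```

A substring search is used here only because it is the shortest way to express the rule; the hardware is described as a byte-by-byte state machine that restarts its search after a mismatch.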
Accepting broadcast Magic Packets* for wake up purposes when the Broadcast Accept bit of the Receive Control register (RCTL.BAM) is 0b is a change from previous devices, which initialized RCTL.BAM to 1b if APM was enabled in the NVM, but then required that bit to be 1b to accept broadcast Magic Packets*, unless broadcast packets passed another perfect or multicast filter.

5.5.3.1.5 ARP/IPv4 Request Packet

The 82574 supports receiving ARP request packets for wake up if the ARP bit is set in the WUFC. Four IPv4 addresses are supported, which are programmed in the IPv4 Address Table (IP4AT). A successfully matched packet must contain a broadcast MAC address, a protocol type of 0x0806, an ARP opcode of 0x01, and one of the four programmed IPv4 addresses. The 82574 also handles ARP request packets that have VLAN tagging on both Ethernet II and Ethernet SNAP types.

Offset  # of Bytes  Field                      Value    Action   Comment
0       6           Destination address                 Compare  MAC header - processed by main address filter
6       6           Source address                      Skip
12      8           Possible LLC/SNAP header            Skip
12      4           Possible VLAN tag                   Skip
12      2           Type                       0x0806   Compare  ARP
14      2           Hardware type              0x0001   Compare
16      2           Protocol type              0x0800   Compare
18      1           Hardware size              0x06     Compare
19      1           Protocol address length    0x04     Compare
20      2           Operation                  0x0001   Compare
22      6           Sender hardware address    -        Ignore
28      4           Sender IP address          -        Ignore
32      6           Target hardware address    -        Ignore
38      4           Target IP address          IP4AT    Compare  May match any of four values in IP4AT

5.5.3.1.6 Directed IPv4 Packet

The 82574 supports receiving directed IPv4 packets for wake up if the IPv4 bit is set in the WUFC. Four IPv4 addresses are supported, which are programmed in the IPv4 Address Table (IP4AT). A successfully matched packet must contain the station's MAC address, a protocol type of 0x0800, and one of the four programmed IPv4 addresses. The 82574 also handles directed IPv4 packets that have VLAN tagging on both Ethernet II and Ethernet SNAP types.

Offset  # of Bytes  Field                      Value    Action   Comment
0       6           Destination address                 Compare  MAC header - processed by main address filter
6       6           Source address                      Skip
12      8           Possible LLC/SNAP header            Skip
12      4           Possible VLAN tag                   Skip
12      2           Type                       0x0800   Compare  IP
14      1           Version/header length      0x4X     Compare  Check IPv4
15      1           Type of service            -        Ignore
16      2           Packet length              -        Ignore
18      2           Identification             -        Ignore
20      2           Fragment information       -        Ignore
22      1           Time to live               -        Ignore
23      1           Protocol                   -        Ignore
24      2           Header checksum            -        Ignore
26      4           Source IP address          -        Ignore
30      4           Destination IP address     IP4AT    Compare  May match any of four values in IP4AT

5.5.3.1.7 Directed IPv6 Packet

The 82574 supports receiving directed IPv6 packets for wake up if the IPv6 bit is set in the WUFC. One IPv6 address is supported and is programmed in the IPv6 Address Table (IP6AT). A successfully matched packet must contain the station's MAC address, a protocol type of 0x86DD, and the programmed IPv6 address. The 82574 also handles directed IPv6 packets that have VLAN tagging on both Ethernet II and Ethernet SNAP types.

Offset  # of Bytes  Field                      Value    Action   Comment
0       6           Destination address                 Compare  MAC header - processed by main address filter
6       6           Source address                      Skip
12      8           Possible LLC/SNAP header            Skip
12      4           Possible VLAN tag                   Skip
12      2           Type                       0x86DD   Compare  IP
14      1           Version/priority           0x6X     Compare  Check IPv6
15      3           Flow label                 -        Ignore
Directed IPv6 packet filter table (continued):

Offset  # of Bytes  Field                      Value    Action   Comment
18      2           Payload length             -        Ignore
20      1           Next header                -        Ignore
21      1           Hop limit                  -        Ignore
22      16          Source IP address          -        Ignore
38      16          Destination IP address     IP6AT    Compare  Match value in IP6AT

5.5.3.2 Flexible Filter

The 82574 supports four flexible filters for host wake up and two flexible filters for TCO wake up. For more details, refer to section 10.2.8.2. Each filter can be configured to recognize any arbitrary pattern within the first 128 bytes of the packet. To configure the flexible filter, software programs:
• The mask values into the Flexible Filter Mask Table (FFMT)
• The required values into the Flexible Filter Value Table (FFVT)
• The minimum packet length into the Flexible Filter Length Table (FFLT)

These contain separate values for each filter. Software must also:
• Enable the filter in the WUFC.
• Enable the overall wake-up functionality by setting PME_En in the PMCSR or WUC.

Once enabled, the flexible filters scan incoming packets for a match. If the filter encounters any byte in the packet where the mask bit is one and the byte doesn't match the byte programmed in FFVT, then the filter fails that packet. If the filter reaches the required length without failing the packet, it passes the packet and generates a wake-up event. It ignores any mask bits set to one beyond the required length.

The following packets are listed for reference purposes only. The flexible filter could be used to filter these packets.

5.5.3.2.1 IPX Diagnostic Responder Request Packet

An IPX diagnostic responder request packet must contain a valid MAC address, a protocol type of 0x8137, and an IPX diagnostic socket of 0x0456. It may include LLC/SNAP headers and VLAN tags. Since filtering this packet relies on the flexible filters, which use offsets specified by the operating system directly, the operating system must account for the extra offset introduced by LLC/SNAP headers and VLAN tags.

Offset  # of Bytes  Field                      Value    Action   Comment
0       6           Destination address                 Compare
6       6           Source address                      Skip
12      8           Possible LLC/SNAP header            Skip
12      4           Possible VLAN tag                   Skip
12      2           Type                       0x8137   Compare  IPX
14      16          Typical IPX information    -        Ignore
30      2           IPX diagnostic socket      0x0456   Compare

5.5.3.2.2 Directed IPX Packet

A valid directed IPX packet contains:
• The station's MAC address.
• A protocol type of 0x8137.
• An IPX node address that equals the station's MAC address.

It might also include LLC/SNAP headers and VLAN tags. Since filtering this packet relies on the flexible filters, which use offsets specified by the operating system directly, the operating system must account for the extra offset introduced by LLC/SNAP headers and VLAN tags.

Offset  # of Bytes  Field                      Value              Action   Comment
0       6           Destination address                           Compare  MAC header - processed by main address filter
6       6           Source address                                Skip
12      8           Possible LLC/SNAP header                      Skip
12      4           Possible VLAN tag                             Skip
12      2           Type                       0x8137             Compare  IPX
14      10          Typical IPX information    -                  Ignore
24      6           IPX node address           Receive Address 0  Compare  Must match Receive Address 0

5.5.3.2.3 IPv6 Neighbor Discovery Filter

In IPv6, a neighbor discovery packet is used for address resolution. A flexible filter can be used to check for a neighbor discovery packet.

5.5.3.3 Wake-Up Packet Storage

The 82574 saves the first 128 bytes of the wake-up packet in its internal buffer, which can be read through the WUPM after the system wakes up.
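
The flexible filter pass/fail rule in section 5.5.3.2 can be illustrated with a short Python sketch. This is a software model under stated assumptions, not the register-level programming sequence: `flex_filter_match` is a hypothetical name, and the mask, value, and length arguments stand in for one filter's entries in FFMT, FFVT, and FFLT, represented here as plain Python sequences rather than the device's register layout.

```python
from typing import Sequence

FLEX_FILTER_BYTES = 128  # a filter covers the first 128 bytes of the packet


def flex_filter_match(packet: bytes,
                      mask: Sequence[bool],
                      value: Sequence[int],
                      length: int) -> bool:
    """Model of one flexible filter.

    mask[i]  - True if byte i of the packet must be compared (FFMT stand-in)
    value[i] - required byte value at offset i (FFVT stand-in)
    length   - minimum packet length for this filter (FFLT stand-in)
    """
    if not 0 <= length <= FLEX_FILTER_BYTES:
        raise ValueError("length must be within the first 128 bytes")
    if len(packet) < length:
        return False  # packet ends before the required length is reached

    # Mask bits beyond `length` are never examined, so they are ignored.
    for offset in range(length):
        if mask[offset] and packet[offset] != value[offset]:
            return False  # first masked mismatch fails the packet
    return True
```

For example, a filter that compares only the two Type bytes at offset 12 against 0x8137 with a length of 14 would pass untagged Ethernet II IPX frames. Tagged or LLC/SNAP frames need different offsets, as the text notes.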
6.0 Non-Volatile Memory (NVM) Map

The NVM contains two regions located at fixed addresses and various regions located at programmable addresses throughout the physical NVM space.

The NVM base area resides at word addresses 0x00-0x3F. All defined fields are fixed, while reserved words might be used by some programmable areas. The base area is present in the NVM in all system configurations. The programmable areas are as follows:
• Additional configuration for the PHY is located in the extended configuration area. The extended configuration pointer indicates the location of the extended configuration area. A value of 0x0000 means that the extended configuration area is disabled. This should be the case for the 82574.
• Manageability configuration is located in a separate area. The manageability pointer indicates the location of that area. A value of 0x0000 means that the manageability configuration area is disabled.

Note: The NVM image must fit the specific NVM part being used. Special attention should be paid to NVM words and fields that vary, such as NVMTYPE or NVSIZE. For the latest 82574 NVM images, contact your Intel representative.

6.1 EEUPDATE

Intel has an MS-DOS* software utility called EEUPDATE that can be used to program EEPROM images in development or production-line environments. To obtain a copy of this program, contact your Intel representative.

6.2 Basic Configuration Table

Table 34 lists the NVM map for the 0x00-0x3F address range:

Table 34. NVM Map of Address Range 0x00-0x3F

Word       Used By  Bits 15:8 / Bits 7:0
0x00       HW       Ethernet Address Byte 2 / Ethernet Address Byte 1
0x01       HW       Ethernet Address Byte 4 / Ethernet Address Byte 3
0x02       HW       Ethernet Address Byte 6 / Ethernet Address Byte 5
0x03-0x07  SW       Compatibility High / Compatibility Low
0x08       SW       PBA, Byte 1 / PBA, Byte 2
0x09       SW       PBA, Byte 3 / PBA, Byte 4
0x0A       HW       Init Control 1
0x0B       HW       Subsystem ID
0x0C       HW       Subsystem Vendor ID
0x0D       HW       Device ID
0x0E       HW       Reserved
0x0F       HW       Init Control 2
0x10       HW       NVM Word 0
0x11       HW       NVM Word 1
0x12       HW       NVM Word 2
0x13       HW       Reserved
0x14       HW       Reserved
0x15       HW       Reserved
0x16       HW       Reserved
0x17       HW       PCIe Electrical Idle Delay
0x18       HW       PCIe Init Configuration 1
0x19       HW       PCIe Init Configuration 2
0x1A       HW       PCIe Init Configuration 3
0x1B       HW       PCIe Control
0x1C       HW       PHY Configuration / LEDCTL 1
0x1D       HW       Reserved
0x1E       HW       Device Rev ID
0x1F       HW       LEDCTL 0 2
0x20       HW       Flash Parameters
0x21       HW       Flash LAN Address
0x22       HW       LAN Power Consumption
0x23       SW       SW Flash Vendor Detection
0x24       HW       Init Control 3
0x25       HW       APT SMBus Address
0x26       HW       APT Rx Enable Parameters
0x27       HW       APT SMBus Control
0x28       HW       APT Init Flags
0x29       HW       APT Management Configuration
0x2A       HW       APT µCode Pointer
0x2B       HW       Least Significant Word of Firmware ID
0x2C       HW       Most Significant Word of Firmware ID
0x2D       HW       NC-SI Management Configuration
0x2E       HW       NC-SI Configuration
0x2F       HW       VPD Pointer
0x30-0x3E  SW       SW Section
0x3F       SW       Software Checksum, Words 0x00 through 0x3F
6.2.1 Hardware Accessed Words

This section describes the NVM words that are loaded by the 82574 hardware.

6.2.1.1 Ethernet Address (Words 0x00-0x02)

The Ethernet Individual Address (IA) is a 6-byte field that must be unique for each Network Interface Card (NIC), and thus unique for each copy of the NVM image. The first three bytes are vendor specific - for example, the IA is equal to [00 AA 00] or [00 A0 C9] for Intel products. The value from this field is loaded into Receive Address Register 0 (RAL0/RAH0). For the purpose of this specification, the IA byte numbering convention is indicated below:

Vendor          IA Byte 1  2   3   4         5         6
Intel original  00         AA  00  Variable  Variable  Variable
Intel new       00         A0  C9  Variable  Variable  Variable

6.2.1.2 Compatibility Bytes (Word 0x03)

Bit    Name                 Default  Description
15:13  Reserved             000b     Reserved. Must be set to 0.
12     ASF SMBus Connected  0b       ASF SMBus connected. 0b = Not connected. 1b = Connected.
11     LOM                  0b       LOM or NIC. 0b = NIC. 1b = LOM.
10     Server NIC           1b       Server NIC. 0b = Client. 1b = Server.
9      Client NIC           1b       Client NIC. 0b = Server. 1b = Client.
8      Retail Card          0b       Retail card. 0b = Retail. 1b = OEM.
7:6    Reserved             00b      Reserved. Must be set to 00b.
5      Reserved             1b       Reserved. Must be set to 1b.
4      SMBus Connected      1b       SMBus connected. 0b = Not connected. 1b = Connected.
3      Reserved             0b       Reserved. Must be set to 0b.
2      PCI Bridge           1b       PCI bridge. 0b = PCI bridge not present. 1b = PCI bridge present.
1:0    Reserved             00b      Reserved. Must be set to 00b.

6.2.1.3 OEM LED Configuration (Word 0x04)

Bit    Name           Default  Description
15:12  Reserved       0xF      Reserved.
11:8   LED 2 Control  0x7      Control for LED 2 - LINK_1000.
7:4    LED 1 Control  0x4      Control for LED 1 - LINK/ACTIVITY.
3:0    LED 0 Control  0x6      Control for LED 0 - LINK_100.

6.2.1.4 Initialization Control Word 1 (Word 0x0A)

Bit    Name                Default  Description
15     Reserved            0b       Reserved.
14     Reserved            0b       Reserved.
13:12  Reserved            00b      Reserved.
11     FRCSPD              1b       Default setting for the Force Speed bit in the Device Control register (CTRL[11]). The hardware default value is 1b.
10     FD                  1b       Default setting for duplex setting. Mapped to CTRL[0]. The hardware default value is 1b.
9      Reserved            1b       Reserved.
8      Reserved            0b       Reserved.
7      Reserved            0b       Must be set to 0b (PCIe CB).
6      Reserved            1b       Reserved.
5      Reserved            1b       Reserved.
4      ILOS                0b       Default setting for the loss-of-signal polarity setting for CTRL[7]. The hardware default value is 0b.
3      Reserved            1b       Reserved.
2      Reserved            0b       Reserved.
1      Load Subsystem IDs  1b       This bit, when equal to 1b, indicates that the device is to load its PCIe subsystem ID and subsystem vendor ID from the NVM (words 0x0B and 0x0C).
0      Load Device ID      1b       This bit, when equal to 1b, indicates that the device is to load its PCIe device ID from the NVM (word 0x0D).
6.2.1.5 Subsystem ID (Word 0x0B)

If the Load Subsystem IDs bit in word 0x0A is set, this word is loaded to initialize the subsystem ID. The default value is 0x0.

6.2.1.6 Subsystem Vendor ID (Word 0x0C)

If the Load Subsystem IDs bit in word 0x0A is set, this word is loaded to initialize the subsystem vendor ID. The default value is 0x8086.

6.2.1.7 Device ID (Word 0x0D)

If the Load Device ID bit in word 0x0A is set, this word is loaded to initialize the device ID of the function. The default value is 0x10D3 for the 82574.

6.2.1.8 Initialization Control Word 2 (Word 0x0F)

Bit    Name             Default  Description
15     APM PME# Enable  0b       Initial value of the Assert PME On APM Wakeup bit in the Wake Up Control register (WUC.APMPME).
14:13  MNGM             00b      Manageability operation mode. This field selects one of the manageability operation modes. 00b = Manageability disable (clock gated). 01b = NC-SI. 10b = Advanced pass through. 11b = Reserved.
12     NVMTYPE          0b       0b = EEPROM. 1b = Flash.
11:8   NVSIZE           0000b    NVM size [bytes] equals 128 * 2 ** NVSIZE. (When NVM = Flash, NVSIZE should be >= 9. Therefore, the minimal supported flash size is 64 KB.) Note: A value of 1111b is reserved. Following are all possible NVSIZE values and their corresponding NVM sizes (in both bytes and bits):
                                 0000b = 128 B / 1 Kb
                                 0001b = 256 B / 2 Kb
                                 0010b = 0.5 KB / 4 Kb
                                 0011b = 1 KB / 8 Kb
                                 0100b = 2 KB / 16 Kb
                                 0101b = 4 KB / 32 Kb
                                 0110b = 8 KB / 64 Kb
                                 0111b = 16 KB / 128 Kb
                                 1000b = 32 KB / 256 Kb
                                 1001b = 64 KB / 0.5 Mb
                                 1010b = 128 KB / 1 Mb
                                 1011b = 256 KB / 2 Mb
                                 1100b = 0.5 MB / 4 Mb
                                 1101b = 1 MB / 8 Mb
                                 1110b = 2 MB / 16 Mb
                                 1111b = Reserved
7      Reserved         0b       Reserved.
6      Reserved         1b       Reserved.
5      Reserved         0b       Reserved.
4      Reserved         1b       Reserved.
3      Reserved         1b       Reserved.
1      Reserved         0b       Reserved.
0      Reserved         0b       Reserved.

6.2.1.9 NVM Protected Word 0 - NVP0 (Word 0x10)

Bit    Name      Default  Description
15:8   Reserved  0x0      Reserved.
7:0    Reserved  0x0      Reserved.

6.2.1.10 NVM Protected Word 1 - NVP1 (Word 0x11)

Bit    Name             Default  Description
15:8   FSECER           0xFF     Defines the instruction code for the block erase used by the 82574. The erase block size is defined by the SECSIZE field in address 0x12.
7:1    Reserved         0x00     Reserved.
0      RAM_PWR_SAVE_EN  1b       When set to 1b, enables reducing power consumption by clock gating the 82574 RAMs.

6.2.1.11 NVM Protected Word 2 - NVP2 (Word 0x12)

Bit    Name      Default  Description
15:8   SIGN      0x7E     Signature. The 8-bit signature field indicates to the device that there is a valid NVM present. If the signature field does not equal 0x7E, then the default values are used for the device configuration.
7      Reserved  0b       Reserved.
6      Reserved  0b       Reserved.
5      Reserved  0b       Reserved.
4      Reserved  0b       Reserved.
3:2    SECSIZE   01b      Defines the flash sector erase size as follows: 00b = 256 bytes. 01b = 4 KB. 10b = Reserved. 11b = Reserved.
1:0    Reserved  0b       Reserved.
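
The NVSIZE encoding in Initialization Control Word 2 follows directly from the formula 128 * 2 ** NVSIZE. The Python sketch below decodes the NVMTYPE and NVSIZE fields from a raw word 0x0F value. The function names are hypothetical and chosen only for illustration; the bit positions are the ones given in the table for word 0x0F.

```python
def nvm_size_bytes(nvsize: int) -> int:
    """Return the NVM size in bytes for a 4-bit NVSIZE value."""
    if not 0 <= nvsize <= 0xF:
        raise ValueError("NVSIZE is a 4-bit field")
    if nvsize == 0xF:
        raise ValueError("NVSIZE value 1111b is reserved")
    return 128 * 2 ** nvsize


def decode_init_control_2(word: int) -> dict:
    """Extract NVMTYPE (bit 12) and NVSIZE (bits 11:8) from word 0x0F."""
    nvsize = (word >> 8) & 0xF
    is_flash = bool((word >> 12) & 0x1)
    return {
        "nvm_type": "flash" if is_flash else "eeprom",
        "nvsize": nvsize,
        "size_bytes": nvm_size_bytes(nvsize),
    }
```

For instance, NVSIZE = 1001b gives 128 * 2 ** 9 = 65536 bytes, which is the 64 KB minimum stated for flash parts.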
6.2.1.12 Extended Configuration Word 1 (Word 0x14)

Bit    Name      Default  Description
15:13  Reserved  0x0      Reserved.
12     Reserved  0b       Reserved.
11:0   Reserved  0x0      Reserved.

6.2.1.13 Extended Configuration Word 2 (Word 0x15)

Bit    Name      Default  Description
15:8   Reserved  0x0      Reserved.
7      Reserved  1b       Reserved.
6      Reserved  0b       Reserved.
5      Reserved  1b       Reserved.
4      Reserved  0b       Reserved.
3      Reserved  1b       Reserved.
2      Reserved  0b       Reserved.
1      Reserved  0b       Reserved.
0      Reserved  0b       Reserved.

6.2.1.14 Extended Configuration Word 3 (Word 0x16)

Bit    Name      Default  Description
15:8   Reserved  0x0      Reserved.
7:0    Reserved  0x0      Reserved.

6.2.1.15 PCIe Electrical Idle Delay (Word 0x17)

Bit    Name      Default  Description
15:14  Reserved  0x0      Reserved.
13     Reserved  0b       Reserved.
12:8   Reserved  0x7      Reserved.
7:3    Reserved  0x0      Reserved.
2      Reserved  1b       Reserved.
1      Reserved  0b       Reserved.
0      Reserved  0b       Reserved.
6.2.1.16 PCIe Init Configuration 1 Word (Word 0x18)

Bit    Name                Default                 Description
15     Reserved            0b                      Reserved.
14:12  L1_Act_Ext_Latency  110b (32 µs - 64 µs)    L1 active exit latency for the configuration space.
11:9   L1_Act_Acc_Latency  110b (32 µs - 64 µs)    L1 active acceptable latency for the configuration space.
8:6    L0s_Acc_Latency     011b (512 ns)           L0s acceptable latency for the configuration space.
5:3    L0s_Se_Ext_Latency  001b                    L0s exit latency for active state power management (separated reference clock) - (latency between 64 ns - 128 ns).
2:0    L0s_Co_Ext_Latency  001b                    L0s exit latency for active state power management (common reference clock) - (latency between 64 ns - 128 ns).

6.2.1.17 PCIe Init Configuration 2 Word (Word 0x19)

Bit    Name               Default  Description
15     DLLP Timer Enable  0b       When set, enables the DLLP timer counter.
14     Reserved           0b       Reserved.
13     Reserved           1b       Reserved.
12     SER_EN             0b       When set to 1b, the serial number capability is enabled.
11:8   ExtraNFTS          0x1      Extra NFTS (Number of Fast Training Signal), which is added to the original requested number of NFTS (as requested by the upstream component).
7:0    NFTS               0x50     Number of special sequence for L0s transition to L0.
6.2.1.18 PCIe Init Configuration 3 Word (Word 0x1A)

Bit    Name                          Default  Description
15     Master_Enable                 0b       When set to 1b, this bit enables the PHY to be a master (upstream component/cross link functionality).
14     Scram_dis                     0b       Scrambling disable. When set to 1b, this bit disables the PCIe LFSR scrambling.
13     Ack_Nak_Sch                   0b       ACK/NAK scheme. 0b = Scheduled for transmission following any TLP. 1b = Scheduled for transmission according to time outs specified in the PCIe specification.
12     Cache_Lsize                   0b       Cache line size. 0b = 64 bytes. 1b = 128 bytes. Note: The value loaded must be equal to the actual cache line size used by the platform, as configured by system software.
11:10  PCIE_Cap                      01b      PCIe capability version.
9      IO_Sup                        1b       I/O support (affects the I/O BAR request). 0b = I/O is not supported. 1b = I/O is supported.
8      Packet_Size                   1b       Default packet size. 0b = 128 bytes. 1b = 256 bytes.
7      Reserved                      0b       Reserved.
6      Reserved                      0b       Reserved.
5      Reserved                      0b       Reserved.
4      Reserved                      0b       Reserved.
3:2    Act_Stat_PM_Sup               11b      Determines support for Active State Link Power Management (ASLPM). Loaded into the PCIe Active State Link PM Support register.
1      Slot_Clock_Cfg                1b       When set, the 82574 uses the PCIe reference clock supplied on the connector (for add-in solutions).
0      Loop Back Polarity Inversion  0b       Check polarity inversion in loop-back master entry. During normal operation, polarity is adjusted during link up. When this bit is set, the receiver re-checks the polarity of Rx-data and then inverts it accordingly when entering a near-end loopback. When cleared, polarity is not re-checked after link up.
6.2.1.19 PCIe Control (Word 0x1B)

Bit    Name                     Default  Description
1:0    Latency_To_Enter_L1      11b      Period in L0s state before transitioning into an L1 state, bits [1:0]. 00b = 64 µs. 01b = 256 µs. 10b = 1 ms. 11b = 4 ms.
2      Electrical IDLE          0b       Electrical idle mask. If set to 1b, disables the check for illegal electrical idle sequence (such as an eidle ordered set without common mode and vice versa), and accepts any of them as the correct eidle sequence. Note: The specification can be interpreted so that the idle ordered set is sufficient for transition to power management states. The use of this bit allows an acceptance of such interpretation and avoids the possibility of correct behavior being understood as illegal sequences.
3      Reserved                 0b       Reserved.
4      Skip Disable             0b       Disable skip symbol insertion in the elastic buffer.
5      L2 Disable               0b       Disable the link from entering L2 state.
6      Reserved                 0b       Reserved.
9:7    MSI_X_NUM                2b       This field specifies the number of entries in the MSI-X tables. MSI_X_NUM is equal to the number of entries minus one. For example, a value of 0x3 means four vectors are available. The 82574 supports a maximum of five vectors.
10     Leaky Bucket Disable     1b       Disable leaky bucket mechanism in the PCIe PHY. Disabling this mechanism holds the link from going to recovery retrain in case of disparity errors.
11     Good Recovery            0b       When this bit is set, the LTSSM recovery states always progress towards link up (force a good recovery when a recovery occurs).
12     PCIE_LTSSM               0b       When cleared, LTSSM complies with the SlimPIPE specification (power mode transition). When set, LTSSM behaves as in previous generations.
13     PCIE Down Reset Disable  0b       Disable a core reset when the PCIe link goes down.
14     Latency_To_Enter_L1      1b       MSB [2] of period in L0s state before transitioning into an L1 state (lower bits are in bits [1:0]). Recommended setting: {14, 1:0} = 011b - 32 µs.
15     PCIE_RX_Valid            0b       Force receiver presence detection. When set, the 82574 overrides the receiver (partner) detection status.
6.2.1.20 LED 1 Configuration Defaults/PHY Configuration (Word 0x1C)

Bit    Name                     Default  Description
3:0    LED1 Mode                0x0      Initial value of the LED1_MODE field specifying what event/state/pattern is displayed on the LED1 (ACTIVITY) output. A value of 0011b (0x3) indicates the ACTIVITY state.
4      Reserved                 0b       Reserved, set to 0b.
5      LED1 Blink Mode          0b       LED1 blink mode. 0b = Blinks at 200 ms on and 200 ms off. 1b = Blinks at 83 ms on and 83 ms off.
6      LED1 Invert              0b       Initial value of LED1_IVRT field. 0b = Active-low output.
7      LED1 Blink               1b       Initial value of LED1_BLINK field. 0b = Non-blinking.
8      Reserved                 1b       Reserved.
9      D0LPLU                   0b       D0 low power link up. Enables decrease in link speed in D0a state when the power policy and power management state dictate so.
10     LPLU                     1b       Low power link up. Enables decrease in link speed in non-D0a states when the power policy and power management state dictate so.
11     Disable 1000 in non-D0a  1b       Disables 1000 Mb/s operation in non-D0a states.
12     Class AB                 0b       When set, the PHY operates in class A mode instead of class B mode. This mode only applies for 1000BASE-T operation. 10BASE-T and 100BASE-T operation continue to run in class B mode by default, regardless of this signal value.
13     Reserved                 1b       Reserved.
14     Giga Disable             0b       When set, 1000 Mb/s operation is disabled in all power modes.
15     Reserved                 0b       Reserved.

6.2.1.21 Device Rev ID (Word 0x1E)

Bit    Name      Default  Description
15     Reserved  0b       Reserved.
14     Reserved  1b       Reserved.
13     Reserved  0b       Reserved.
12     Reserved  0b       Reserved.
11     Reserved  0b       Reserved.
10     Reserved  0b       Reserved.
9      Reserved  1b       Reserved.
8      Reserved  1b       Reserved.
7:0    Reserved  0x0      Reserved.
6.2.1.22 LED 0-2 Configuration Defaults (Word 0x1F)

Bit    Name             Default        Description
3:0    LED0 Mode        0x0            Initial value of the LED0_MODE field specifying what event/state/pattern is displayed on the LED0 (LINK_UP) output. A value of 0010b (0x2) causes this to indicate LINK_UP state.
4      Reserved         0b             Reserved, set to 0b.
5      LED0 Blink Mode  0b (note 1)    LED0 blink mode. 0b = Blinks at 200 ms on and 200 ms off. 1b = Blinks at 83 ms on and 83 ms off.
6      LED0 Invert      0b             Initial value of LED0_IVRT field. 0b = Active-low output.
7      LED0 Blink       0b             Initial value of LED0_BLINK field. 0b = Non-blinking.
11:8   LED2 Mode        0x0            Initial value of the LED2_MODE field specifying what event/state/pattern is displayed on the LED2 (LINK_100) output. A value of 0110b (0x6) causes this to indicate 100 Mb/s operation.
12     Reserved         0b             Reserved, set to 0b.
13     LED2 Blink Mode  0b (note 1)    LED2 blink mode. 0b = Blinks at 200 ms on and 200 ms off. 1b = Blinks at 83 ms on and 83 ms off.
14     LED2 Invert      0b             Initial value of LED2_IVRT field. 0b = Active-low output.
15     LED2 Blink       0b             Initial value of LED2_BLINK field. 0b = Non-blinking.

Note 1: These bits are read from the NVM.

6.2.1.23 Flash Parameters - FLPAR (Word 0x20)

Bit    Name      Default  Description
15:8   FDEVER    0xFF     Defines the instruction code for the flash device erase. A value of 0x00 means that the device does not support the device erase.
7:6    Reserved  0x0      Reserved.
5      FLSSTn    0b       SST flash not. When set to 0b, indicates an SST flash type: write access to the flash is limited to 1 byte at a time and it is required to clear write protection at power up. When set to 1b, burst write access to the flash is enabled up to 256 bytes and it is not required to clear write protection at power up.
4      LONGC     0b       Very long cycle indication. When set to 1b, LONGC indicates to the 82574 that a flash write instruction is considered a very long instruction. When set to 0b, LONGC indicates that a write cycle to the flash is not considered a very long cycle.
3:0    Reserved  0x0      Reserved.
6.2.1.24 Flash LAN Address - FLANADD (Word 0x21)

Bit    Name      Default  Description
15     DISLFB    0b       1b = Disables the LAN flash BAR.
14:12  LANSIZE   0x0      LAN boot expansion window size = 2 KB * 2 ** LANSIZE.
11:8   LBADD     0x0      LAN flash address. Defines the location of the LAN boot expansion ROM in the physical flash device as defined in the following equation: Word address = 4 KB * (LBADD + PEND).
7      DISLEXP   0b       1b = Disables the LAN expansion boot ROM BAR.
6:1    Reserved  0x0      Reserved, must be set to 0b.
0      Reserved  0b       Reserved, must be set to 0b.

6.2.1.25 LAN Power Consumption (Word 0x22)

Bit    Name          Default  Description
15:8   LAN D0 Power  0xF      The value in this field is reflected in the PCI Power Management Data register of the function for D0 power consumption and dissipation (Data_Select = 0 or 4). Power is defined in 100 mW units. The power also includes the external logic required for the LAN function.
7:5    Reserved      0x0      Reserved.
4:0    LAN D3 Power  0x4      The value in this field is reflected in the PCI Power Management Data register of the function for D3 power consumption and dissipation (Data_Select = 3 or 7). Power is defined in 100 mW units. The power also includes the external logic required for the function. The most significant bits in the Data register that reflects the power values are padded with zeros.

6.2.1.26 Flash Software Detection Word (Word 0x23)

Setting this word to 0xFFFF enables detection of the flash vendor by software tools.

Bit    Name                   Default  Description
15     Checksum Validity      0x0      Checksum validity indication. 0b = Checksum should be corrected by software tools. 1b = Checksum may be considered valid.
14     Deep Smart Power Down  0x1      Enable/disable bit for Deep Smart Power Down (DSPD) functionality. 0b = Enable DSPD. 1b = Disable DSPD (default).
13:8   Reserved               0xFF     Reserved.
7:0    Flash Vendor Detect    0xFF     This field must be set to 0xFF.
6.2.1.27 Initialization Control 3 (Word 0x24)

Bit    Name        Default  Description
15     Reserved    0b       Reserved.
14     Reserved    1b       Reserved.
13     Reserved    1b       Reserved.
12     Reserved    0b       Reserved.
11     Reserved    1b       Reserved.
10     APM Enable  0b       Initial value of Advanced Power Management Wake Up Enable in the Wake Up Control (WUC.APME) register. Mapped to CTRL[6] and to WUC[0].
9      Reserved    0b       Reserved.
8      Reserved    0b       Reserved.
7:1    Reserved    0x0      Reserved.
0      No_Phy_Rst  1b       No PHY reset. When set to 1b, this bit prevents the PHY reset signal and the power changes reflected to the PHY according to the MANC.Keep_PHY_Link_Up value.

6.2.2 Software Accessed Words

6.2.2.1 Compatibility Fields (Words 0x03 - 0x07)

Five words in the NVM image are reserved for compatibility information. New bits within these fields can be defined as the need arises for determining software compatibility between various hardware revisions.

6.2.2.2 PBA Number (Words 0x08 and 0x09)

The nine-digit Printed Board Assembly (PBA) number used for Intel manufactured Network Interface Cards (NICs) is stored in a 4-byte field. The dash itself is not stored, and neither is the first digit of the 3-digit suffix, as it is always zero for the affected products. Note that through the course of hardware ECOs, the suffix field (byte 4) is incremented. The purpose of this information is to enable customer support (or any user) to identify the exact revision level of a product. Network driver software should not rely on this field to identify the product or its capabilities.

Product  PWA Number  Byte 1  Byte 2  Byte 3  Byte 4
Example  123456-003  12      34      56      03

6.2.2.3 PXE Words (Words 0x30:0x3E)

Words 0x30 through 0x3E are reserved for software and are used by IBA/PXE software.
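
The PBA number layout in section 6.2.2.2 can be turned back into its printed form with a few lines of Python. This is a minimal sketch under two assumptions: each stored byte holds two digits written as hexadecimal nibbles, as the example row 12 34 56 03 suggests, and bytes 1 and 3 sit in bits 15:8 of words 0x08 and 0x09 as listed in Table 34. The function name `pba_string` is hypothetical.

```python
def pba_string(word_08: int, word_09: int) -> str:
    """Format NVM words 0x08 and 0x09 as a printed PBA number.

    Assumes two digits per byte, stored as hex nibbles, with
    byte 1 / byte 3 in the upper half of each word.
    """
    byte1, byte2 = (word_08 >> 8) & 0xFF, word_08 & 0xFF
    byte3, byte4 = (word_09 >> 8) & 0xFF, word_09 & 0xFF
    # The dash and the leading zero of the 3-digit suffix are not stored,
    # so both are re-inserted here.
    return f"{byte1:02X}{byte2:02X}{byte3:02X}-0{byte4:02X}"
```

With the example from the table, words 0x1234 and 0x5603 format as 123456-003.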
6.2.2.4 iSCSI Boot Configuration Start Address (Word 0x3D)

Bit    Name     Default  Description
15:0   Address  0x0      NVM word address of the iSCSI boot configuration structure starting point.

6.2.2.5 Checksum Word Calculation (Word 0x3F)

The checksum word (0x3F) is used to ensure that the base NVM image is a valid image. The value of this word should be calculated such that after adding all the words (0x00-0x3F), including the checksum word itself, the sum is 0xBABA. The initial value in the 16-bit summing register should be 0x0000 and the carry bit should be ignored after each addition.

Note: Hardware does not calculate the word 0x3F checksum during an NVM write or read. It must be calculated by software independently and included in the NVM write data. This field is provided strictly for software verification of NVM validity. All hardware configuration based on word 0x00-0x3F content is based on the validity of the Signature field of the NVM.

6.3 Manageability Configuration Words

6.3.1 SMBus APT Configuration Words

6.3.1.1 APT SMBus Address (Word 0x25)

Bit    Name              Default  Description
15:9   SMBus Address     0x0      Defines the default SMBus address.
8      Reserved          0b       Reserved.
7:1    MC SMBus Address  0x0      Management Controller (MC) SMBus target address.
0      Reserved          0b       Reserved.

6.3.1.2 APT Rx Enable Parameters (Word 0x26)

Bit    Name            Default  Description
15:8   Alert Value     0x0      Rx enable byte 14 (alert value).
7:0    Interface Data  0x0      Rx enable byte 13 (interface data).
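
The checksum rule in section 6.2.2.5 (words 0x00 through 0x3F sum to 0xBABA in a 16-bit register, carries discarded) can be sketched as follows. The function names are hypothetical, and the image is assumed to be available as a list of 16-bit integers indexed by word address. Reading or writing the NVM itself is outside the scope of this example.

```python
NVM_CHECKSUM_WORD = 0x3F   # word address of the checksum
NVM_CHECKSUM_SUM = 0xBABA  # required 16-bit sum of words 0x00-0x3F


def nvm_checksum_word(words: list) -> int:
    """Compute the value to store in word 0x3F.

    `words` must hold at least words 0x00-0x3E; anything from 0x3F on
    is ignored.
    """
    partial = sum(words[:NVM_CHECKSUM_WORD]) & 0xFFFF  # carries discarded
    return (NVM_CHECKSUM_SUM - partial) & 0xFFFF


def nvm_image_valid(words: list) -> bool:
    """Check that words 0x00-0x3F sum to 0xBABA modulo 2**16."""
    if len(words) <= NVM_CHECKSUM_WORD:
        return False
    total = sum(words[:NVM_CHECKSUM_WORD + 1]) & 0xFFFF
    return total == NVM_CHECKSUM_SUM
```

Masking once at the end gives the same result as discarding the carry after every addition, because addition modulo 2**16 is associative.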
6.3.1.3 APT SMBus Control (Word 0x27)

Bit    Name                  Default  Description
15:8   SMBus Fragment Size   0x20     Defines the largest SMBus fragment that can be generated by the 82574. The 82574 does not generate an SMBus fragment containing more than (SMBus_Fragment_Size + 1) bytes. The value of this field must be Dword aligned. Bits 9:8 must be set to 00b.
7:0    Notification Timeout  0x0      SMBus notification timeout. Each unit adds 1.1 to 1.3 ms. Resolution depends on the internal clock, which might vary its frequency in different power saving modes.

6.3.1.4 APT Init Flags (Word 0x28)

Bit    Name                    Description
15:6   Reserved                Reserved, set to 0x0.
5      Reserved                Reserved.
4      Force TCO Enable        1b = Enable internal reset on Force TCO command. 0b = Force TCO command has no impact on the 82574.
3      SMB ARP Disabled        1b = The 82574 does not support SMBus ARP functionality. 0b = The 82574 supports SMBus ARP functionality.
2      SMB Block Read Command  This bit defines the block read SMBus command that should be used: 0b = SMBus block read command is 0xC0. 1b = SMBus block read command is 0xD0.
1:0    Notification Method     00b = SMBus alert. 01b = Asynchronous notify. 10b = Direct receive. 11b = Reserved.

6.3.1.5 APT Management Configuration (Word 0x29)

Bit    Name              Description
15:14  Reserved          Reserved, set to 0b.
13:4   Code Size         Size of the manageability code in Dwords.
3:2    Reserved          Reserved, set to 0b.
1:0    RAM Partitioning  00b = Tx 2 KB, Rx 6 KB, rest 4 KB. 01b = Tx 2 KB, Rx 7 KB, rest 3 KB. 10b = Tx 2 KB, Rx 8 KB, rest 2 KB. 11b = Tx 2 KB, Rx 9 KB, rest 1 KB.
6.3.1.6 APT µCode Pointer (Word 0x2A)

Bit     Name       Description
15:12   Reserved   Reserved, set to 0b.
11:0    Pointer    Word pointer to the start of the management firmware µcode in the NVM. For example, a value of 0x100 indicates the firmware µcode starts at NVM word address 0x100. (1)

(1) µCode in the NVM is organized such that the lower word of a Dword code is stored first.

Note: APT code size and pointer should be configured such that the code does not cross the 4 KB boundary.

6.3.2 NC-SI Configuration Words

6.3.2.1 Least Significant (LS) Word of the Firmware ID (Word 0x2B)

Bit    Name          Description
15:0   Firmware ID   Firmware revision LS word.

6.3.2.2 Most Significant (MS) Word of the Firmware ID (Word 0x2C)

Bit    Name          Description
15:0   Firmware ID   Firmware revision MS word.

6.3.2.3 NC-SI Management Configuration (Word 0x2D)

Bit     Name               Description
15:14   Reserved           Reserved, set to 0b.
13:4    Code Size          Size of the MNG code in Dwords.
3:2     Reserved           Reserved, set to 0b.
1:0     RAM Partitioning   00b = Tx 4 KB, Rx 4 KB, rest 4 KB. 01b = Tx 4 KB, Rx 5 KB, rest 3 KB. 10b = Tx 4 KB, Rx 6 KB, rest 2 KB. 11b = Tx 4 KB, Rx 7 KB, rest 1 KB.
6.3.2.4 NC-SI Configuration (Word 0x2E)

Bit     Name            Description
15      Reserved        Reserved, set to 0b.
14:12   Package ID      NC-SI package ID.
11:0    µCode Pointer   Word pointer to the start of the management firmware µcode in the NVM. For example, a value of 0x100 indicates the firmware µcode starts at NVM word address 0x100. (1)

(1) µCode in the NVM is organized such that the lower word of a Dword code is stored first.

Note: NC-SI code size and pointer should be configured such that the code does not cross the 4 KB boundary.
82574 GbE Controller - Inline Functions

7.0 Inline Functions

7.1 Packet Reception

Packet reception consists of recognizing the presence of a packet on the wire, performing address filtering, storing the packet in the receive data FIFO, transferring the data to one of the two receive queues in host memory, and updating the state of a receive descriptor.

Note: The maximum supported received packet size is 16383 bytes.

7.1.1 Packet Address Filtering

Hardware stores incoming packets in host memory subject to the following filter modes. If there is insufficient space in the receive FIFO, hardware drops them and indicates the missed packet in the appropriate statistics registers.

The following filter modes are supported:

• Exact unicast/multicast: The destination address must exactly match one of 16 stored addresses. These addresses can be unicast or multicast.

  Note: The software device driver can only use 15 entries (entries 0-14). Entry 15 should be kept untouched by the software device driver. It can be used only by manageability's firmware or an external manageability controller (MC).

• Promiscuous unicast: Receive all unicasts.

• Multicast: The upper bits of the incoming packet's destination address index a bit vector that indicates whether to accept the packet; if the bit in the vector is one, accept the packet, otherwise, reject it. The 82574 provides a 4096-bit vector. Software provides four choices of which bits are used for indexing. These are [47:36], [46:35], [45:34], or [43:32] of the internally stored representation of the destination address (see Figure 61).

• Promiscuous multicast: Receive all multicast packets.

• VLAN: Receive all VLAN packets that are for this station and have the appropriate bit set in the VLAN filter table. A detailed discussion and explanation of VLAN packet filtering is contained in section 7.5.3.

Normally, only good packets are received.
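The multicast hash filter described in the list above can be modeled as follows. This is a sketch under stated assumptions: the destination address is passed as a 48-bit integer already in the internally stored representation (the mapping from wire order is given by Figure 61, which is not reproduced here), and the numbering 0 to 3 of the four bit-range choices is an assumption made for the example. Class and function names are hypothetical.

```python
VECTOR_BITS = 4096  # size of the multicast bit vector (from the text)

# Right shift that brings each selectable 12-bit range down to bits 11:0.
# The choice numbers 0..3 are an assumption of this sketch.
CHOICE_SHIFT = {0: 36,   # bits [47:36]
                1: 35,   # bits [46:35]
                2: 34,   # bits [45:34]
                3: 32}   # bits [43:32]


def multicast_index(dest_addr, choice):
    """Return the 12-bit index into the 4096-bit vector."""
    return (dest_addr >> CHOICE_SHIFT[choice]) & 0xFFF


class MulticastFilter:
    """Software model of the imperfect (hash) multicast filter."""

    def __init__(self, choice=0):
        self.choice = choice
        self.vector = bytearray(VECTOR_BITS // 8)

    def add(self, dest_addr):
        i = multicast_index(dest_addr, self.choice)
        self.vector[i // 8] |= 1 << (i % 8)

    def accepts(self, dest_addr):
        i = multicast_index(dest_addr, self.choice)
        return bool(self.vector[i // 8] & (1 << (i % 8)))
```

Because only 12 bits of the address are examined, distinct addresses that share those bits select the same vector bit, so the filter can accept addresses that were never added; software is expected to discard those.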
Good packets are defined as those packets with none of the following:

• CRC error
• Symbol error
• Sequence error
• Length error
• Alignment error
• Carrier extension error or RX_ERR error

However, if the store-bad-packet bit is set in the Receive Control register (RCTL.SBP), then bad packets that pass the filter function are stored in host memory. Packet errors are indicated by error bits in the receive descriptor (RDESC.ERRORS). It is possible to receive all packets, regardless of whether they are bad, by setting the promiscuous enables and the store-bad-packet bit.

Note: CRC errors before the SFD are ignored. Every packet must have a valid SFD (RX_DV with no RX_ER in the GMII/MII interface) in order to be recognized by the device (even bad packets).

7.1.2 Receive Data Storage

Memory buffers pointed to by descriptors store packet data. Hardware supports the following receive buffer sizes:

• 256 bytes, 512 bytes, 1024 bytes, 2048 bytes, 4096 bytes, 8192 bytes, 16384 bytes
• FLXBUF x 1024 bytes, where FLXBUF = 1, 2, 3, ... 15

Buffer size is selected by bit settings in the Receive Control register (RCTL.BSIZE, RCTL.BSEX, RCTL.DTYP and RCTL.FLXBUF).

The 82574 (in legacy mode) places no alignment restrictions on receive memory buffer addresses. This is desirable in situations where the receive buffer was allocated by higher layers in the networking software stack, as these higher layers might have no knowledge of a specific device's buffer alignment requirements.

Note: Although alignment is completely unrestricted, it is highly recommended that software allocate receive buffers on at least cache-line boundaries whenever possible.

7.1.3 Legacy Receive Descriptor Format

A receive descriptor is a data structure that contains the receive data buffer address and fields for hardware to store packet information. If the RFCTL.EXSTEN bit is clear and RCTL.DTYP equals 00b, the 82574 uses the legacy Rx descriptor as shown in the following figure.
Figure 23. 82574 Legacy Rx Descriptor

Offset   63:48      47:40    39:32    31:16                 15:0
0        Buffer Address [63:0]
8        VLAN Tag   Errors   Status   Packet Checksum (1)   Length

(1) The checksum indicated here is the unadjusted 16-bit ones complement of the packet. A software assist might be required to back out appropriate information prior to sending it up to upper software layers. The packet checksum is always reported in the first descriptor (even in the case of multi-descriptor packets).

7.1.3.1 Length Field (16-Bit, Offset 0)

Upon receipt of a packet for this device, hardware stores the packet data into the indicated buffer and writes the length, packet checksum, status, errors, and VLAN tag fields. Length covers the data written to a receive buffer including CRC bytes (if any).

Note: Software must read multiple descriptors to determine the complete length for packets that span multiple receive buffers.

7.1.3.2 Packet Checksum (16-Bit, Offset 16)

For standard 802.3 packets (non-VLAN) the packet checksum is by default computed over the entire packet from the first byte of the DA through the last byte of the CRC, including the Ethernet and IP headers. Software can modify the starting offset for the packet checksum calculation via the Receive Checksum Control register (RXCSUM). This register is described in section 10.2.5.15. To verify the TCP/UDP checksum using the packet checksum, software must adjust the packet checksum value to back out the bytes that are not part of the true TCP checksum. When operating with the legacy Rx descriptor, RXCSUM.IPPCSE and RXCSUM.PCSD should be cleared (the default value).

For packets with a VLAN header, the packet checksum includes the header if VLAN stripping is not enabled by CTRL.VME. If VLAN header stripping is enabled, the packet checksum and the starting offset of the packet checksum exclude the VLAN header.

7.1.3.3 Status Field (8-Bit, Offset 32)

Status information indicates whether the descriptor has been used and whether the referenced buffer is the last one for the packet.

Figure 24. Receive Status (RDESC.STATUS-0) Layout

Bit     7      6      5       4       3    2      1     0
Field   Rsvd   IPCS   TCPCS   UDPCS   VP   Rsvd   EOP   DD

Rsvd (bit 7) - Reserved
IPCS (bit 6) - IPv4 checksum calculated on packet
TCPCS (bit 5) - TCP checksum calculated on packet
UDPCS (bit 4) - UDP checksum calculated on packet
VP (bit 3) - Packet is 802.1Q (matched VET)
Reserved (bit 2) - Reserved
EOP (bit 1) - End of packet
DD (bit 0) - Descriptor done

EOP: Packets that exceed the receive buffer size span multiple receive buffers. EOP indicates whether this is the last buffer for an incoming packet. DD indicates whether hardware is done with the descriptor. When the DD bit is set along with EOP, the received packet is completely in main memory. Software can determine buffer usage by setting the status byte to zero before making the descriptor available to hardware, and checking it for non-zero content at a later time. For multi-descriptor packets, packet status is provided in the final descriptor of the packet (EOP set). If EOP is not set for a descriptor, only the address, length, and DD bits are valid.

VP: The VP field indicates whether the incoming packet's type matches VET (for example, if the packet is a VLAN (802.1Q) type). It is set if the packet type matches VET and CTRL.VME is set. For a further description of 802.1Q VLANs, see section 7.5.

IPCS, TCPCS, UDPCS: These bit descriptions are listed in the following table:

TCPCS   UDPCS   IPCS    Functionality
0b      0b      0b      Hardware does not provide checksum offload.
1b      0b      1b/0b   Hardware provides IPv4 checksum offload if IPCS is active, and TCP checksum offload. Pass/fail indication is provided in the error field (IPE and TCPE).
1b      1b      1b/0b   Hardware provides IPv4 checksum offload if IPCS is active, and UDP checksum offload. Pass/fail indication is provided in the error field (IPE and TCPE).

IPv6 packets do not have the IPCS bit set, but might have the TCPCS bit set if the 82574 recognized the TCP or UDP packet.

7.1.3.4 Error Field (8-Bit, Offset 40)

Most error information appears only when the store-bad-packet bit (RCTL.SBP) is set and a bad packet is received. Figure 25 shows the definition of the possible errors and their bit positions.

Figure 25. Receive Errors (RDESC.ERRORS) Layout

Bit     7     6     5      4     3     2     1    0
Field   RXE   IPE   TCPE   CXE   Rsv   SEQ   SE   CE

RXE (bit 7) - Rx data error
IPE (bit 6) - IPv4 checksum error
TCPE (bit 5) - TCP/UDP checksum error
CXE (bit 4) - Carrier extension error
Rsv (bit 3) - Reserved
SEQ (bit 2) - Sequence error
SE (bit 1) - Symbol error
CE (bit 0) - CRC error or alignment error

The IP and TCP checksum error bits are valid only when the IPv4 or TCP/UDP checksum(s) is performed on the received packet as indicated via IPCS and TCPCS previously mentioned. These, along with the other error bits, are valid only when the EOP and DD bits are set in the descriptor.

Note: Receive checksum errors have no effect on packet filtering.

If receive checksum offloading is disabled (RXCSUM.IPOFL and RXCSUM.TUOFL), the IPE and TCPE bits are 0b.

The RXE bit indicates that a data error occurred during the packet reception that has been detected by the PHY. This generally corresponds to signal errors occurring during the packet reception. This bit is valid only when the EOP and DD bits are set, and it is not set in descriptors unless RCTL.SBP (store-bad-packets) is set.

CRC errors and alignment errors are both indicated via the CE bit. Software can distinguish between these errors by monitoring the respective statistics registers.

7.1.3.5 VLAN Tag Field (16-Bit, Offset 48)

Hardware stores additional information in the receive descriptor for 802.1Q packets. If the packet type is 802.1Q (determined when a packet matches VET and CTRL.VME = 1b), then the VLAN tag field records the VLAN information and the four-byte VLAN information is stripped from the packet data storage. Otherwise, the VLAN tag field contains 0x0000.

Bit     15:13   12    11:0
Field   PRI     CFI   VLAN

7.1.4 Extended Rx Descriptor

If the RFCTL.EXSTEN bit is set and RCTL.DTYP equals 00b, the 82574 uses the extended Rx descriptor as follows:

Descriptor read format:

Offset   63:1                    0
0        Buffer Address [63:0]
8        Reserved                0 (DD)
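To make the legacy receive descriptor of section 7.1.3 concrete, the following sketch unpacks one 16-byte written-back descriptor using the field offsets from Figure 23 and the bit positions from Figures 24 and 25. It assumes the descriptor is read from host memory in little-endian byte order; the function and key names are hypothetical.

```python
import struct

# Quadword 0: buffer address. Quadword 1, from bit 0 upward:
# length (16), packet checksum (16), status (8), errors (8), VLAN tag (16).
LEGACY_RX_DESC = struct.Struct("<QHHBBH")   # 16 bytes, little-endian (assumed)

STATUS_BITS = {"DD": 0, "EOP": 1, "VP": 3, "UDPCS": 4, "TCPCS": 5, "IPCS": 6}
ERROR_BITS = {"CE": 0, "SE": 1, "SEQ": 2, "CXE": 4,
              "TCPE": 5, "IPE": 6, "RXE": 7}


def _flags(value, bits):
    """Return the set of flag names whose bit is set in value."""
    return {name for name, bit in bits.items() if value & (1 << bit)}


def decode_legacy_rx_descriptor(raw):
    """Decode a 16-byte legacy Rx descriptor after write-back."""
    addr, length, csum, status, errors, vlan = LEGACY_RX_DESC.unpack(raw)
    return {
        "buffer_address": addr,
        "length": length,
        "packet_checksum": csum,
        "status": _flags(status, STATUS_BITS),
        "errors": _flags(errors, ERROR_BITS),
        "priority": vlan >> 13,        # PRI, bits 15:13
        "cfi": (vlan >> 12) & 0x1,     # CFI, bit 12
        "vlan_id": vlan & 0xFFF,       # VLAN, bits 11:0
    }
```

As the text notes, the error bits and most status bits are meaningful only when both DD and EOP appear in the decoded status set.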
7.1.4.1 Buffer Address (64-Bit, Offset 0.0)

The field contains the physical address of the receive data buffer. The size of the buffer is defined by the RCTL register (RCTL.BSIZE, RCTL.BSEX, RCTL.DTYP and RCTL.FLXBUF fields).

7.1.4.2 DD (1-Bit, Offset 8.0)

This is the location of the DD bit in the status field. The software device driver must clear this bit before it hands the receive descriptor to the 82574. The software device driver can use this bit field later on as a completion indication of the hardware.

Descriptor write-back format:

Offset   63:48             47:32               31:20            19:0
0        RSS Hash [63:32]                      MRQ [31:0]
         or Packet Checksum, IP Identification
8        VLAN Tag          Length              Extended Error   Extended Status

Note: The RSS hash and the packet checksum/IP identification pair (light blue in the original figure) are mutually exclusive, selected by RXCSUM.PCSD.

7.1.4.3 MRQ Field (32-Bit, Offset 0.0)

Field      Bit(s)   Description
RSS Type   3:0      Indicates the type of hash function used for RSS computation (see below).
Reserved   7:4      Reserved
Queue      12:8     Indicates the receive queue associated with the packet. It is generated by the redirection table as defined by the multiple receive queues enable field. This field is reserved when multiple receive queues are disabled.
Reserved   31:13    Reserved

RSS type decoding: The RSS type field represents the hash type used by the RSS function.

RSS Type   Description
0x0        No hash computation done for this packet.
0x1        IPv4 with TCP hash used (NdisTcpIPv4).
0x2        IPv4 hash used (NdisIPv4).
0x3        IPv6 with TCP hash used (NdisTcpIPv6).
0x4        IPv6 with extension header hash used (NdisIPv6Ex).
0x5        IPv6 hash used (NdisIPv6).
0x6-0xF    Reserved
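The MRQ field tables translate directly into a small decoder. The sketch below is illustrative only; the function name and the returned key names are hypothetical, while the bit positions and type codes are those listed above.

```python
RSS_TYPES = {
    0x0: "no hash computation",
    0x1: "IPv4 with TCP hash (NdisTcpIPv4)",
    0x2: "IPv4 hash (NdisIPv4)",
    0x3: "IPv6 with TCP hash (NdisTcpIPv6)",
    0x4: "IPv6 with extension header hash (NdisIPv6Ex)",
    0x5: "IPv6 hash (NdisIPv6)",
}


def decode_mrq(mrq):
    """Decode the 32-bit MRQ field of an extended write-back descriptor."""
    rss_type = mrq & 0xF             # bits 3:0
    return {
        "rss_type": rss_type,
        "rss_type_name": RSS_TYPES.get(rss_type, "reserved"),
        "queue": (mrq >> 8) & 0x1F,  # bits 12:8
    }
```

For example, `decode_mrq(0x0103)` reports RSS type 0x3 on queue 1.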
7.1.4.4 Packet Checksum (16-Bit, Offset 0.48)

For standard 802.3 packets (non-VLAN) the packet checksum is by default computed over the entire packet from the first byte of the DA through the last byte of the CRC, including the Ethernet and IP headers. Software can modify the starting offset for the packet checksum calculation via the Receive Checksum Control register (RXCSUM). This register is described in section 10.2.5.15. To verify the TCP/UDP checksum using the packet checksum, software must adjust the packet checksum value to back out the bytes that are not part of the true TCP checksum. Likewise, for fragmented UDP packets, the packet checksum field can be used to accelerate UDP checksum verification by the host processor. This operation is enabled by the RXCSUM.IPPCSE bit as described in section 10.2.5.15.

For packets with a VLAN header, the packet checksum includes the header if VLAN stripping is not enabled by the CTRL.VME bit. If VLAN header stripping is enabled, the packet checksum and the starting offset of the packet checksum exclude the VLAN header.

This field is mutually exclusive with the RSS hash. It is enabled when the RXCSUM.PCSD bit is cleared.

7.1.4.5 IP Identification (16-Bit, Offset 0.32)

This field stores the IP identification field in the IP header of the incoming packet. The software device driver should ignore this field when IPIDV is not set.

This field is mutually exclusive with the RSS hash. It is enabled when the RXCSUM.PCSD bit is cleared.

7.1.4.6 RSS Hash (32-Bit, Offset 0.32)

This field is mutually exclusive with the IP identification and the packet checksum. It is enabled when the RXCSUM.PCSD bit is set. This field contains the result of the Microsoft* RSS hash function.
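Backing out bytes from the packet checksum, as section 7.1.4.4 requires of software, relies on ones complement arithmetic: subtracting a partial sum is the same as adding its bitwise complement with end-around carry. The sketch below shows only that arithmetic. It assumes the value being adjusted is a 16-bit ones complement sum over big-endian 16-bit words and that the bytes being removed start on a word boundary and have even length; how the descriptor value maps onto such a sum is not spelled out here, and the function names are hypothetical.

```python
def _fold(total):
    """Fold carries back into the low 16 bits (end-around carry)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ones_complement_sum(data):
    """16-bit ones complement sum of data, taken as big-endian words."""
    data = bytes(data)
    if len(data) % 2:
        data += b"\x00"          # pad an odd trailing byte with zero
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    return _fold(total)


def ones_complement_sub(total, part):
    """Remove the contribution `part` from the running sum `total`."""
    return _fold(total + (~part & 0xFFFF))
```

With a 14-byte Ethernet header, for instance, `ones_complement_sub(sum_of_frame, sum_of_header)` gives the sum over the remaining bytes. Ones complement arithmetic has two encodings of zero (0x0000 and 0xFFFF), so comparisons against zero should accept both.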
7.1.4.7 Extended Status (20-Bit, Offset 8.0)

Bit     19:16     15    14:11      10
Field   PKTTYPE   ACK   Reserved   UDPV

Bit     9       8     7      6      5       4       3    2      1     0
Field   IPIDV   TST   Rsvd   IPCS   TCPCS   UDPCS   VP   Rsvd   EOP   DD

PKTTYPE (bits 19:16) - Packet type
ACK (bit 15) - ACK packet indication
Reserved (bits 14:11) - Reserved
UDPV (bit 10) - Valid UDP XSUM
IPIDV (bit 9) - IP identification valid
TST (bit 8) - Time stamp taken
Rsvd (bit 7) - Reserved
IPCS (bit 6) - IPv4 checksum calculated on packet - same as legacy descriptor.
TCPCS (bit 5) - TCP checksum calculated on packet - same as legacy descriptor.
UDPCS (bit 4) - UDP checksum calculated on packet.
VP (bit 3) - Packet is 802.1Q (matched VET) - same as legacy descriptor.
Rsv (bit 2) - Reserved
EOP (bit 1) - End of packet - same as legacy descriptor.
DD (bit 0) - Descriptor done - same as legacy descriptor.

DD, EOP, IXSM, VP, UDPCS, TCPCS, IPCS: Same meaning as in the legacy receive descriptor.

IPCS, TCPCS, UDPCS: The meaning of these bits is shown in the following table:

TCPCS   UDPCS   IPCS    Functionality
0b      0b      1b/0b   Hardware provides IPv4 checksum offload if IPCS is active.
1b      0b      1b/0b   Hardware provides IPv4 checksum offload if IPCS is active, and TCP checksum offload. Pass/fail indication is provided in the error field (IPE and TCPE).
0b      1b      1b/0b   For IPv4 packets, hardware provides IP checksum offload if IPCS is active, and fragmented UDP checksum offload. IP pass/fail indication is provided in the IPE field. The fragmented UDP checksum is provided in the packet checksum field if the RXCSUM.PCSD bit is cleared.
1b      1b      1b/0b   Hardware provides IPv4 checksum offload if IPCS is active, and UDP checksum offload. Pass/fail indication is provided in the error field (IPE and TCPE).

Unsupported packet types do not have the IPCS or TCPCS bits set. IPv6 packets do not have the IPCS bit set, but might have the TCPCS bit set if the 82574 recognized the TCP or UDP packet.

IPIDV (bit 9): The IPIDV bit indicates that the incoming packet was identified as a fragmented IPv4 packet. The IPID field contains a valid IP identification value if RXCSUM.PCSD is cleared.

UDPV (bit 10): The UDPV bit indicates that the incoming packet contains a valid (non-zero value) checksum field in an incoming fragmented UDP IPv4 packet. It means that the packet checksum field contains the UDP checksum as described in this section. When this field is cleared in the first fragment that contains the UDP header, it means that the packet does not contain a valid UDP checksum and the checksum field in the Rx descriptor should be ignored. This field is always cleared in incoming fragments that do not contain the UDP header.
ACK (bit 15): The ACK bit indicates that the received packet was an ACK packet with or without TCP payload depending on the RFCTL.ACKD_DIS bit.

PKTTYPE (bits 19:16): The PKTTYPE field defines the type of the packet that was detected by the 82574. The 82574 looks for the most complex match first and falls back toward the most common one, as shown in the following packet type table:

Packet Type   Description
0x0           MAC, (VLAN/SNAP), payload
0x1           MAC, (VLAN/SNAP), IPv4, payload
0x2           MAC, (VLAN/SNAP), IPv4, TCP/UDP, payload
0x3           MAC, (VLAN/SNAP), IPv4, IPv6, payload
0x4           MAC, (VLAN/SNAP), IPv4, IPv6, TCP/UDP, payload
0x5           MAC, (VLAN/SNAP), IPv6, payload
0x6           MAC, (VLAN/SNAP), IPv6, TCP/UDP, payload
0x7           MAC, (VLAN/SNAP), IPv4, TCP, iSCSI, payload
0x8           MAC, (VLAN/SNAP), IPv4, TCP/UDP, NFS, payload
0x9           MAC, (VLAN/SNAP), IPv4, IPv6, TCP, iSCSI, payload
0xA           MAC, (VLAN/SNAP), IPv4, IPv6, TCP/UDP, NFS, payload
0xB           MAC, (VLAN/SNAP), IPv6, TCP, iSCSI, payload
0xC           MAC, (VLAN/SNAP), IPv6, TCP/UDP, NFS, payload
0xD           Reserved
0xE           PTP packet (TimeSync according to EtherType)

• Payload does not necessarily mean raw data; it can also be an unsupported header.
• If there is an NFS/iSCSI header in the packet, it can be seen in the packet type field.

Note: If the device is not configured to provide any offload that requires packet parsing, the packet type field is set to 0x0 regardless of the actual packet type.

7.1.4.8 Extended Errors (12-Bit, Offset 8.20)

Bit     11    10    9      8     7      6     5    4    3:0
Field   RXE   IPE   TCPE   CXE   Rsvd   SEQ   SE   CE   Rsvd

RXE (bit 11) - Rx data error - same as legacy descriptor.
IPE (bit 10) - IPv4 checksum error - same as legacy descriptor.
TCPE (bit 9) - TCP/UDP checksum error - same as legacy descriptor.
CXE (bit 8) - Carrier extension error - same as legacy descriptor.
SEQ (bit 6) - Sequence error - same as legacy descriptor.
SE (bit 5) - Symbol error - same as legacy descriptor.
CE (bit 4) - CRC error or alignment error - same as legacy descriptor.
Reserved (bits 7, 3:0) - Reserved

RXE, IPE, TCPE, CXE, SEQ, SE, CE: Same as legacy descriptor.

Length (16-bit, offset 8.32): Same as the length field at offset 8.0 in the legacy descriptor.

VLAN tag (16-bit, offset 8.48): Same as legacy descriptor.

7.1.4.8.1 Receive UDP Fragmentation Checksum

The 82574 might provide receive fragmented UDP checksum offload. The following setup should be made to enable this mode:

• The RXCSUM.PCSD bit should be cleared. The packet checksum and IP identification fields are mutually exclusive with the RSS hash. When the PCSD bit is cleared, packet checksum and IP identification are active.
• The RXCSUM.IPPCSE bit should be set. This field enables the IP payload checksum that is designed for the fragmented UDP checksum.
• The RXCSUM.PCSS field must be zero. The packet checksum start should be zero to enable auto start of the checksum calculation. See the following table for an exact description of the checksum calculation.

The following table lists the resulting descriptor fields for each incoming packet type:

Incoming Packet Type                       Packet Checksum                                                                              IP Identification            UDPV/IPIDV                                         UDPCS/TCPCS
Non-IPv4 packet                            Unadjusted 16-bit ones complement checksum of the entire packet (excluding VLAN header)      Reserved                     0b/0b                                              0b/0b
Fragmented IPv4 with TCP header            Same as above                                                                                Incoming IP identification   0b/1b                                              0b/0b
Non-fragmented IPv4 packet                 Same as above                                                                                Reserved                     0b/0b                                              Depends on transport header and TUOFL field
Fragmented IPv4 without transport header   The unadjusted ones complement checksum of the IP payload                                    Incoming IP identification   0b/1b                                              1b/0b
Fragmented IPv4 with UDP header            Same as above                                                                                Incoming IP identification   1b if the UDP header checksum is valid/1b          1b/0b

Note: When the software device driver computes the 16-bit ones complement sum on the incoming packets of the UDP fragments, it should expect a value of 0xFFFF. See section 7.1.10 for supported packet formats.
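Pulling the extended write-back fields of section 7.1.4 together, the sketch below decodes a 16-byte extended write-back descriptor. It assumes little-endian byte order in host memory and takes the RXCSUM.PCSD setting as a parameter, since that bit decides whether bits 63:32 of the first quadword hold the RSS hash or the packet checksum and IP identification. Function and key names are hypothetical.

```python
import struct

EXT_STATUS_BITS = {"DD": 0, "EOP": 1, "VP": 3, "UDPCS": 4, "TCPCS": 5,
                   "IPCS": 6, "TST": 8, "IPIDV": 9, "UDPV": 10, "ACK": 15}
EXT_ERROR_BITS = {"CE": 4, "SE": 5, "SEQ": 6, "CXE": 8,
                  "TCPE": 9, "IPE": 10, "RXE": 11}


def _flags(value, bits):
    """Return the set of flag names whose bit is set in value."""
    return {name for name, bit in bits.items() if value & (1 << bit)}


def decode_extended_writeback(raw, pcsd):
    """Decode a 16-byte extended Rx write-back descriptor.

    pcsd mirrors RXCSUM.PCSD: True selects the RSS hash, False selects
    packet checksum plus IP identification.
    """
    q0, q8 = struct.unpack("<QQ", raw)       # little-endian (assumed)
    status = q8 & 0xFFFFF                    # bits 19:0
    errors = (q8 >> 20) & 0xFFF              # bits 31:20
    out = {
        "mrq": q0 & 0xFFFFFFFF,
        "status": _flags(status, EXT_STATUS_BITS),
        "pkttype": (status >> 16) & 0xF,     # status bits 19:16
        "errors": _flags(errors, EXT_ERROR_BITS),
        "length": (q8 >> 32) & 0xFFFF,
        "vlan_tag": (q8 >> 48) & 0xFFFF,
    }
    if pcsd:
        out["rss_hash"] = q0 >> 32
    else:
        out["ip_identification"] = (q0 >> 32) & 0xFFFF
        out["packet_checksum"] = (q0 >> 48) & 0xFFFF
    return out
```

For the fragmented UDP mode described above, a driver would call this with `pcsd=False` and consult `IPIDV` and `UDPV` in the status set before trusting the IP identification and packet checksum values.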
7.1.5 Packet Split Receive Descriptor

The 82574 uses the packet split feature when the RFCTL.EXSTEN bit is set and RCTL.DTYP = 01b. The software device driver must also program the buffer sizes in the PSRCTL register.

Descriptor read format:

Offset   63:1                      0
0        Buffer Address 0
8        Buffer Address 1          0
16       Buffer Address 2
24       Buffer Address 3

7.1.5.1 Buffer Addresses [3:0] (4 x 64 Bit)

The physical address of each buffer is written in the buffer addresses fields. The sizes of these buffers are statically defined by BSIZE0-BSIZE3 in the PSRCTL register.

Note: Software notes:

• All buffers' addresses in a packet split descriptor must be word aligned.
• A packet header cannot span across buffers; therefore, the size of the first buffer must be larger than any expected header size. Otherwise the packet will not be split.
• If software sets a buffer size to zero, all buffers following that one should be set to zero as well. Pointers in the packet split receive descriptors to buffers with a zero size should be set to any address, but not to null pointers. Hardware does not write to this address.
• When configured to packet split and a given packet spans across two or more packet split descriptors, the first buffer of any descriptor (other than the first one) is not used.

7.1.5.2 DD (1-Bit, Offset 8.0)

The software device driver might use the DD bit from the status field to determine when a descriptor has been used. Therefore, the software device driver must ensure that the least significant bit (LSB) of buffer address 1 is zero. This should not be an issue, since the buffers should be page aligned for the packet split feature to be useful.

Note: Any software device driver that cannot align buffers should not be using this descriptor format.
Descriptor write-back format:

Offset   63:48             47:32               31:20            19:16   15:0
0        RSS Hash [63:32]                      MRQ [31:0]
         or Packet Checksum, IP Identification
8        VLAN Tag          Length 0            Extended Error   Extended Status [19:0]
16       Length 3          Length 2            Length 1 [31:16]         Header Status [15:0]
24       Reserved

Note: The RSS hash and the packet checksum/IP identification pair (light blue in the original figure) are mutually exclusive, selected by RXCSUM.PCSD.

MRQ - Same as extended Rx descriptor.
Packet checksum, IP identification, RSS hash - Same as extended Rx descriptor.
Extended status, extended errors, VLAN tag - Same as extended Rx descriptor.

7.1.5.3 Length 0 (16-Bit, Offset 8.32), Length [3:1] (3 x 16-Bit, Offset 16.16)

Upon a packet reception, hardware stores the packet data in one or more of the indicated buffers. Hardware writes in the length field of each buffer the number of bytes that were posted in the corresponding buffer. If no packet data is stored in a given buffer, hardware writes 0b in the corresponding length field. Length covers the data written to the receive buffer including CRC bytes (if any).

Software is responsible for checking the length fields of all buffers for data that hardware might have written to the corresponding buffers.

7.1.5.4 Header Status (16-Bit, Offset 16.0)

Bit     15      14:10      9:0
Field   HDRSP   Reserved   HLEN (header length)

HDRSP (bit 15) - Headers were split
Reserved (bits 14:10) - Reserved
Header length (bits 9:0) - Packet header length

HDRSP (bit 15): The HDRSP bit (when active) indicates that hardware split the headers from the packet data for the packet contained in this descriptor. The following table identifies the packets that are supported by header/data split functionality. In addition, packets with a data portion smaller than 16 bytes are not guaranteed to be split. If the device is not configured to provide any offload that requires packet parsing, the HDRSP bit is set to 0b even if packet split was enabled. Non-split packets are stored linearly in the buffers of the receive descriptor.
HLEN (bits 9:0): The HLEN field indicates the header length, in bytes, that was analyzed by the 82574. The 82574 posts the first HLEN bytes of the incoming packet to buffer zero of the Rx descriptor.

Packet types supported by the packet split: The 82574 provides header split for the packet types listed in the following table. Other packet types are posted sequentially in the buffers of the packet split receive buffers.

Packet Type   Description                                          Header Split
0x0           MAC, (VLAN/SNAP), payload                            No.
0x1           MAC, (VLAN/SNAP), IPv4, payload                      Split header after L3 if fragmented packets.
0x2           MAC, (VLAN/SNAP), IPv4, TCP/UDP, payload             Split header after L4 if not fragmented, otherwise treat as packet type 1.
0x3           MAC, (VLAN/SNAP), IPv4, IPv6, payload                Split header after L3 if either IPv4 or IPv6 indicates a fragmented packet.
0x4           MAC, (VLAN/SNAP), IPv4, IPv6, TCP/UDP, payload       Split header after L4 if IPv4 not fragmented and if IPv6 does not include fragment extension header, otherwise treat as packet type 3.
0x5           MAC, (VLAN/SNAP), IPv6, payload                      Split header after L3 if fragmented packets.
0x6           MAC, (VLAN/SNAP), IPv6, TCP/UDP, payload             Split header after L4 if IPv6 does not include fragment extension header, otherwise treat as packet type 5.
0x7           MAC, (VLAN/SNAP), IPv4, TCP, iSCSI, payload          Split header after L5 if not fragmented, otherwise treat as packet type 1.
0x8           MAC, (VLAN/SNAP), IPv4, TCP/UDP, NFS, payload        Split header after L5 if not fragmented, otherwise treat as packet type 1.
0x9           MAC, (VLAN/SNAP), IPv4, IPv6, TCP, iSCSI, payload    Split header after L5 if IPv4 not fragmented and if IPv6 does not include fragment extension header, otherwise treat as packet type 3.
0xA           MAC, (VLAN/SNAP), IPv4, IPv6, TCP/UDP, NFS, payload  Split header after L5 if IPv4 not fragmented and if IPv6 does not include fragment extension header, otherwise treat as packet type 3.
0xB           MAC, (VLAN/SNAP), IPv6, TCP, iSCSI, payload          Split header after L5 if IPv6 does not include fragment extension header, otherwise treat as packet type 5.
0xC           MAC, (VLAN/SNAP), IPv6, TCP/UDP, NFS, payload        Split header after L5 if IPv6 does not include fragment extension header, otherwise treat as packet type 5.
0xD           Reserved
0xE           PTP packet (TimeSync according to EtherType)         No.

Note: A header of a fragmented IPv6 packet is defined until the fragmented extension header.

Note: If the device is not configured to provide any offload that requires packet parsing, the packet type field is set to 0x0 regardless of the actual packet type. When packet split is enabled, the packet type field is always valid.
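The header status word and the four length fields of a packet split write-back descriptor (sections 7.1.5.3 and 7.1.5.4) can be extracted as sketched below. The inputs are the quadwords at descriptor offsets 8 and 16, already read as 64-bit integers; the function names are hypothetical.

```python
def decode_header_status(word):
    """Decode the 16-bit header status word (offset 16.0)."""
    return {
        "hdrsp": bool(word & 0x8000),   # bit 15: headers were split
        "hlen": word & 0x3FF,           # bits 9:0: header length in bytes
    }


def decode_split_lengths(qword8, qword16):
    """Return (header status, [length0, length1, length2, length3]).

    Length 0 sits at bits 47:32 of the quadword at offset 8; header
    status and lengths 1-3 share the quadword at offset 16.
    """
    length0 = (qword8 >> 32) & 0xFFFF
    header_status = decode_header_status(qword16 & 0xFFFF)
    lengths = [length0] + [(qword16 >> shift) & 0xFFFF
                           for shift in (16, 32, 48)]
    return header_status, lengths
```

Software would then walk the returned list and pass on only buffers whose length is non-zero, since hardware writes zero for buffers that received no data.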
7.1.6 Receive Descriptor Fetching

The fetching algorithm attempts to make the best use of PCIe bandwidth by fetching a cache line (or more) of descriptors with each burst. The following paragraphs briefly describe the descriptor fetch algorithm and the software control provided.

When the on-chip buffer is empty, a fetch happens as soon as any descriptors are made available (host writes to the tail pointer).

When the on-chip buffer is nearly empty (RXDCTL.PTHRESH), a prefetch is performed each time enough valid descriptors (RXDCTL.HTHRESH) are available in host memory and no other PCIe activity of greater priority is pending (descriptor fetches and write backs or packet data transfers).

When the number of descriptors in host memory is greater than the available on-chip descriptor storage, the chip might elect to perform a fetch that is not a multiple of cache line size. The hardware performs this non-aligned fetch if doing so results in the next descriptor fetch being aligned on a cache line boundary. This enables the descriptor fetch mechanism to be most efficient in the cases where it has fallen behind software.

Note: The 82574 never fetches descriptors beyond the descriptor tail pointer.

7.1.7 Receive Descriptor Write Back

Processors have cache line sizes that are larger than the receive descriptor size (16 bytes). Consequently, writing back descriptor information for each received packet can cause expensive partial cache line updates. Two mechanisms minimize the occurrence of partial line write backs:

• Receive descriptor packing
• Null descriptor padding

The following sections explain these mechanisms.

7.1.7.1 Receive Descriptor Packing

To maximize memory efficiency, receive descriptors are packed together and written as a cache line whenever possible. Descriptors accumulate and are opportunistically written out in cache line-oriented chunks.
Used descriptors are also explicitly written out under the following scenarios:

• RXDCTL.WTHRESH descriptors have been used (the specified maximum threshold of unwritten used descriptors has been reached)
• The last descriptors of the allocated descriptor ring have been used (to enable hardware to re-align to the descriptor ring start)
• A receive timer expires (RADV or RDTR)
• Explicit software flush (RDTR.FPD)

When the number of descriptors specified by RXDCTL.WTHRESH have been used, they are written back, regardless of cache line alignment. It is therefore recommended that WTHRESH be a multiple of cache line size. When a receive timer (RADV or RDTR) expires, all used descriptors are forced to be written back prior to initiating the interrupt, for consistency. Software might explicitly flush accumulated descriptors by writing the RDTR register with the high order bit (FPD) set.
7.1.7.2 Null Descriptor Padding

Hardware stores no data in descriptors with a null data address. Software can make use of this property to cause the first condition under receive descriptor packing to occur early. Hardware writes back null descriptors with the DD bit set in the status byte and all other bits unchanged.

Note: Null descriptor padding is not supported for packet split descriptors.

7.1.8 Receive Descriptor Queue Structure

Figure 26 shows the structure of the two receive descriptor rings. Hardware maintains two circular queues of descriptors and writes back used descriptors just prior to advancing the head pointer(s). Head and tail pointers wrap back to base when size descriptors have been processed.

Figure 26. Receive Descriptor Ring Structure
[Figure: circular buffer queue extending from Base to Base + Size, with the Head and Tail pointers delimiting the receive queue.]

Software adds receive descriptors by advancing the tail pointer(s) to refer to the address of the entry just beyond the last valid descriptor. This is accomplished by writing the descriptor tail register(s) with the offset of the entry beyond the last valid descriptor. The hardware adjusts its internal tail pointer(s) accordingly. As packets arrive, they are stored in memory and the head pointer(s) is incremented by hardware. When the head pointer(s) is equal to the tail pointer(s), the queue(s) is empty. Hardware stops storing packets in system memory until software advances the tail pointer(s), making more receive buffers available.
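The head and tail behavior described in section 7.1.8 can be modeled with a few lines of index arithmetic. This is a software model for illustration, not driver code: indices count descriptors rather than bytes, and the class and method names are hypothetical. Because equal head and tail means the queue is empty, the model lets software hand over at most size - 1 descriptors at a time; that limit is a consequence drawn for this sketch rather than a statement quoted from the text.

```python
class RxRing:
    """Index-level model of one receive descriptor ring."""

    def __init__(self, size):
        self.size = size   # number of descriptors in the ring
        self.head = 0      # advanced by hardware
        self.tail = 0      # advanced by software

    def hardware_owned(self):
        """Descriptors in [head, tail), modulo the ring size."""
        return (self.tail - self.head) % self.size

    def is_empty(self):
        return self.head == self.tail

    def software_add(self, count=1):
        """Software makes `count` more descriptors available."""
        if self.hardware_owned() + count >= self.size:
            raise ValueError("tail would catch up with head")
        self.tail = (self.tail + count) % self.size

    def hardware_receive(self):
        """Hardware consumes one descriptor; False if none is available."""
        if self.is_empty():
            return False   # packet cannot be stored until tail advances
        self.head = (self.head + 1) % self.size
        return True
```

A ring of eight descriptors therefore accepts `software_add(7)` on a fresh ring, after which seven calls to `hardware_receive()` succeed and the eighth returns False.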
the receive descriptor head and tail pointers reference 16-byte blocks of memory. shaded boxes in the figure represent descriptors that have stored incoming packets but have not yet been recognized by software. software can determine if a receive buffer is valid by reading descriptors in memory rather than by i/o reads. any descriptor with a non-zero status byte has been processed by the hardware, and is ready to be handled by the software.

note: when configured to use the packet split feature, software needs to increment the descriptor tail by two for every descriptor ready in memory (packet split descriptors are 32 bytes, while regular descriptors are 16 bytes).

note: the head pointer points to the next descriptor that will be written back. at the completion of the descriptor write-back operation, this pointer is incremented by the number of descriptors written back. hardware owns all descriptors between [head ... tail]. any descriptor not in this range is owned by software.

the receive descriptor rings are described by the following registers:
• receive descriptor base address registers (rdba0, rdba1)
  - this register indicates the start of the descriptor ring buffer; this 64-bit address is aligned on a 16-byte boundary and is stored in two consecutive 32-bit registers. hardware ignores the lower 4 bits.
• receive descriptor length registers (rdlen0, rdlen1)
  - this register determines the number of bytes allocated to the circular buffer. this value must be a multiple of 128 (the maximum cache line size). since each descriptor is 16 bytes in length, the total number of receive descriptors is always a multiple of 8.
• receive descriptor head registers (rdh0, rdh1)
  - this register holds a value that is an offset from the base, and indicates the in-progress descriptor. there can be up to 64 k descriptors in the circular buffer. hardware maintains a shadow copy that includes those descriptors completed but not yet stored in memory.
• receive descriptor tail registers (rdt0, rdt1)
  - this register holds a value that is an offset from the base, and identifies the location beyond the last descriptor hardware can process. this is the location where software writes the first new descriptor.

if software statically allocates buffers, and uses memory read to check for completed descriptors, it simply has to zero the status byte in the descriptor to make it ready for reuse by hardware. this is not a hardware requirement (moving the hardware tail pointer(s) is), but is necessary for performing an in-memory scan.
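the head/tail arithmetic described above can be sketched in a few lines of python. this is an illustration only, not driver code: the class `RxRing` and its methods are hypothetical names, head and tail are modelled as descriptor indices (offsets from base in 16-byte units), and the sketch assumes software leaves one slot unused so that head equal to tail always means an empty queue, which is a common driver convention rather than something this section states.

```python
DESC_SIZE = 16       # bytes per regular receive descriptor
RDLEN_ALIGN = 128    # rdlen must be a multiple of 128 bytes


class RxRing:
    """Toy model of one receive descriptor ring (hypothetical helper)."""

    def __init__(self, rdlen_bytes):
        if rdlen_bytes <= 0 or rdlen_bytes % RDLEN_ALIGN:
            raise ValueError("rdlen must be a non-zero multiple of 128 bytes")
        self.size = rdlen_bytes // DESC_SIZE  # therefore a multiple of 8
        self.head = 0  # next descriptor hardware will write back
        self.tail = 0  # entry just beyond the last descriptor hardware may use

    def hw_owned(self):
        """Number of descriptors in [head ... tail), owned by hardware."""
        return (self.tail - self.head) % self.size

    def advance_tail(self, count):
        """Software hands `count` more descriptors to hardware."""
        # Assumption: keep one slot free so head == tail means "empty".
        if count < 0 or self.hw_owned() + count > self.size - 1:
            raise ValueError("too many descriptors for this ring")
        self.tail = (self.tail + count) % self.size

    def hw_write_back(self, count):
        """Hardware wrote back `count` descriptors and advanced the head."""
        if count < 0 or count > self.hw_owned():
            raise ValueError("hardware never passes the tail")
        self.head = (self.head + count) % self.size


ring = RxRing(4096)      # 4096 bytes / 16 bytes = 256 descriptors
ring.advance_tail(255)   # give hardware everything except the spare slot
ring.hw_write_back(10)   # ten packets received and written back
ring.advance_tail(10)    # software recycles those ten descriptors
```

both pointers wrap with a modulo on the ring size, which mirrors the statement that head and tail wrap back to base once size descriptors have been processed.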
7.1.9 receive interrupts

the following indicates the presence of new packets:

• receive timer (icr.rxt0) due to packet delay timer (rdtr)
  a predetermined amount of time has elapsed since the last packet was received and transferred to host memory. every time a new packet is received and transferred to the host memory, the timer is re-initialized to the predetermined value. the timer then counts down and triggers an interrupt if no new packet is received and transferred to host memory completely before the timer expires. software can set the timer value to zero if it needs to be notified immediately (no interval delay) whenever a new packet has been stored in memory.
  writing the absolute timer with its high order bit set to 1b forces an explicit flush of any partial cache lines worth of consumed descriptors. hardware writes all used descriptors to memory and updates the globally visible value of the rxdh head pointer(s).
  this timer is re-initialized when an interrupt is generated and restarts when a new packet is observed. it stays disabled until a new packet is received and transferred to the host memory. the packet delay timer is also re-initialized when an interrupt occurs due to an absolute timer expiration or small packet-detection interrupt.

• receive timer (icr.rxt0) due to absolute timer (radv)
  a predetermined amount of time has elapsed since the first packet received after the hardware timer was written (specifically, after the last packet data byte was written to memory).
  this timer is re-initialized when an interrupt is generated and restarts when a new packet is observed. it stays disabled until a new packet is received and transferred to the host memory. the absolute delay timer is also re-initialized when an interrupt occurs due to a packet timer expiration or small packet-detection interrupt.

the absolute timer and the packet delay timer can be used together.
the following table lists the conditions when the absolute timer and the packet delay timer are initialized, disabled and when they start counting. the timer is always disabled if the value of the rdtr = 0b. figure 27 further clarifies the packet timer operation.

interrupt timers     | when starts counting                                           | when re-initialized                                             | when disabled
absolute delay timer | timer inactive and receive packet transferred to host memory. | at start                                                        | on expiration; due to other receive interrupt.
packet delay timer   | timer inactive and receive packet transferred to host memory. | at start; new packet received and transferred to host memory   | on expiration; due to other receive interrupt.
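the interaction of the two timers can be explored with a small simulation. the sketch below is a simplified model built only from the rules in the table above, not a description of the actual hardware: time is counted in abstract ticks, `rx_timer_interrupts` is a hypothetical name, a timer that expires on the same tick as a packet arrival is assumed to fire first, and small-packet and other receive interrupts are ignored.

```python
def rx_timer_interrupts(arrivals, rdtr, radv):
    """Return the ticks at which icr.rxt0 fires in this simplified model.

    arrivals: ticks at which packets finish transferring to host memory
    rdtr:     packet delay timer value (0 means interrupt immediately)
    radv:     absolute timer value (0 means the absolute timer is unused)
    """
    interrupts = []
    pkt_deadline = None  # None means the timer is disabled
    abs_deadline = None

    def expire_before(t):
        """Fire the earliest running timer if it expires at or before t."""
        nonlocal pkt_deadline, abs_deadline
        running = [d for d in (pkt_deadline, abs_deadline) if d is not None]
        if running and (t is None or min(running) <= t):
            interrupts.append(min(running))
            # An interrupt re-initializes and disables both timers.
            pkt_deadline = abs_deadline = None

    for t in sorted(arrivals):
        expire_before(t)
        if rdtr == 0:
            interrupts.append(t)  # no delay: notify on every packet
            continue
        pkt_deadline = t + rdtr   # re-initialized by every new packet
        if radv > 0 and abs_deadline is None:
            abs_deadline = t + radv  # started by the first packet only
    expire_before(None)           # let the last running timer expire
    return interrupts


# Packets every 5 ticks keep resetting the packet delay timer (8 ticks),
# so the absolute timer (12 ticks) bounds the latency of the first packet.
print(rx_timer_interrupts([0, 5, 10, 15], rdtr=8, radv=12))
```

with the absolute timer unused (`radv=0`), the same arrival pattern produces a single interrupt after the last packet, which is the latency problem the absolute timer is meant to bound.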
figure 27. packet delay timer operation (with state diagram)
[figure: state diagram. states: initial state, disabled, running. transition labels: "packet received and transferred to host memory, action: re-initialize"; "timer expires, interrupt generated"; "other receive interrupts"]

figure 28 shows how the packet timer and absolute timer can be used together:

figure 28. packet and absolute timers
[figure: three timing diagrams of packets (pkt #1, pkt #2, ...) against the absolute timer value.
case a: using only an absolute timer. the interrupt is generated due to pkt #1.
case b: using an absolute timer in conjunction with the packet timer. the interrupt is generated (due to pkt #4) as the absolute timer expires; the packet delay timer stays disabled until the next packet is received and transferred to host memory.
case c: packet timer expiring while a packet is transferred to host memory (1: packet timer expires, 2: interrupt generated, 3: absolute timer reset). illustrates that the packet timer is re-started only after a packet is transferred to host memory.]

• small receive packet detect (icr.srpd)
  - a receive interrupt is asserted when small-packet detection is enabled (rsrpd is set with a non-zero value) and a packet of (size < rsrpd.size) has been transferred into the host memory. when comparing the size, the headers and crc are included (if crc stripping is not enabled). crc and vlan headers are not included if they have been stripped. a receive timer interrupt cause (icr.rxt0) will also be noted when the small packet-detect interrupt occurs.
• receive ack frame interrupt is asserted when a frame is detected to be an ack frame. detection of ack frames is masked through the ims register. when a frame is detected as an ack frame, an interrupt is asserted after the raid.ack_delay timer has expired, provided the ack frame interrupts are not masked in the ims register.

note: the ack frame detect feature is only active when configured to packet split (rctl.dtyp=01b) or the extended status feature is enabled (rfctl.exsten is set).
receive interrupts can also be generated for the following events:
• receive descriptor minimum threshold (icr.rxdmt)
  - the minimum descriptor threshold helps avoid descriptor under-run by generating an interrupt when the number of free descriptors becomes equal to the minimum. it is measured as a fraction of the receive descriptor ring size. this interrupt stops and re-initializes all active delayed receive interrupt timers until a new packet is observed.
• receiver fifo overrun (icr.rxo)
  - fifo overrun occurs when hardware attempts to write a byte to a full fifo. an overrun could indicate that software has not updated the tail pointer(s) to provide enough descriptors/buffers, or that the pcie bus is too slow draining the receive fifo. incoming packets that overrun the fifo are dropped and do not affect future packet reception. this interrupt stops and re-initializes all active delayed receive interrupt timers.

7.1.10 receive packet checksum offloading

the 82574 supports the offloading of three receive checksum calculations: the packet checksum, the ipv4 header checksum, and the tcp/udp checksum.

the packet checksum is the one's complement over the receive packet, starting from the byte indicated by rxcsum.pcss (zero corresponds to the first byte of the packet), after stripping. for packets with a vlan header, the packet checksum includes the header if vlan stripping is not enabled by ctrl.vme. if vlan header stripping is enabled, the packet checksum and the starting offset of the packet checksum exclude the vlan header due to masking of the vlan header. for example, for an ethernet ii frame encapsulated as an 802.3ac vlan packet, with ctrl.vme set and rxcsum.pcss set to 14, the packet checksum would include the entire encapsulated frame, excluding the 14-byte ethernet header (da, sa, type/length) and the 4-byte q-tag. the packet checksum does not include the ethernet crc if the rctl.secrc bit is set.
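as a rough illustration of the packet checksum, the following python sketch computes a 16-bit one's complement sum starting at a given offset and shows how the contribution of a block of bytes can be removed from such a sum afterwards. the function names are hypothetical, 16-bit words are assumed to be in network (big-endian) order, and this section does not say whether the hardware reports the sum itself or its bitwise complement, so the sketch works with the plain sum.

```python
def ones_complement_sum16(data, start=0):
    """16-bit one's complement sum of data[start:], big-endian words."""
    chunk = bytes(data[start:])
    if len(chunk) % 2:
        chunk += b"\x00"  # pad an odd trailing byte with zero
    total = 0
    for i in range(0, len(chunk), 2):
        total += (chunk[i] << 8) | chunk[i + 1]
    while total >> 16:    # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def back_out(total, unwanted):
    """Remove the contribution of `unwanted` from a one's complement sum.

    `unwanted` must cover whole 16-bit words of the summed region.
    One's complement subtraction is addition of the bitwise complement;
    note that 0x0000 and 0xFFFF both represent zero in this arithmetic.
    """
    minus = ~ones_complement_sum16(unwanted) & 0xFFFF
    return ones_complement_sum16(total.to_bytes(2, "big") + minus.to_bytes(2, "big"))


# Hypothetical frame: a 14-byte header followed by 8 payload bytes.
header = bytes(range(14))
payload = bytes.fromhex("0001f203f4f5f6f7")
frame = header + payload

from_payload = ones_complement_sum16(frame, start=14)  # like pcss = 14
whole_frame = ones_complement_sum16(frame)             # like pcss = 0
adjusted = back_out(whole_frame, header)
```

here `from_payload` and `adjusted` describe the same bytes: starting the sum at offset 14 is equivalent to summing the whole frame and then backing out the 14 header bytes.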
software must make the required offsetting computation (to back out the bytes that should not have been included and to include the pseudo-header) prior to comparing the packet checksum against the tcp checksum stored in the packet.

for supported packet/frame types, the entire checksum calculation can be offloaded to the 82574. if rxcsum.ipofld is set to 1b, the 82574 calculates the ipv4 checksum and reports a pass/fail indication to software via the ipv4 checksum error bit (rdesc.ipe) in the error field of the receive descriptor. similarly, if rxcsum.tuofld is set to 1b, the 82574 calculates the tcp or udp checksum and reports a pass/fail condition to software via the tcp/udp checksum error bit (rdesc.tcpe). these error bits are valid when the respective status bits indicate the checksum was calculated for the packet (rdesc.ipcs and rdesc.tcpcs respectively).

similarly, if rfctl.ipv6_dis and rfctl.ip6xsum_dis are cleared to 0b and rxcsum.tuofld is set to 1b, the 82574 calculates the tcp or udp checksum for ipv6 packets. it then indicates a pass/fail condition in the tcp/udp checksum error bit (rdesc.tcpe).

if neither rxcsum.ipofld nor rxcsum.tuofld is set, the checksum error bits (ipe and tcpe) are 0b for all packets.

supported frame types:
• ethernet ii
• ethernet snap
table 35. supported receive checksum capabilities

packet type                                                                | hw ip checksum calculation | hw tcp/udp checksum calculation
ipv4 packets                                                               | yes                        | yes
ipv6 packets                                                               | no (n/a)                   | yes
ipv6 packet with next header options:                                      |                            |
  hop-by-hop options                                                       | no (n/a)                   | yes
  destinations options                                                     | no (n/a)                   | yes
  routing (with len 0)                                                     | no (n/a)                   | yes
  routing (with len >0)                                                    | no (n/a)                   | no
  fragment                                                                 | no (n/a)                   | no
  home option                                                              | no (n/a)                   | no
ipv4 tunnels:                                                              |                            |
  ipv4 packet in an ipv4 tunnel                                            | no                         | no
  ipv6 packet in an ipv4 tunnel                                            | yes (ipv4)                 | yes (1)
ipv6 tunnels:                                                              |                            |
  ipv4 packet in an ipv6 tunnel                                            | no                         | no
  ipv6 packet in an ipv6 tunnel                                            | no                         | no
packet is an ipv4 fragment                                                 | yes                        | no
packet is greater than 1552 bytes; (lpe=1b)                                | yes                        | yes
packet has 802.3ac tag                                                     | yes                        | yes
ipv4 packet has ip options (ip header is longer than 20 bytes)             | yes                        | yes
packet has tcp or udp options                                              | yes                        | yes
ip header's protocol field contains a protocol # other than tcp or udp.   | yes                        | no

(1) the ipv6 header portion can include supported extension headers as described in the ipv6 filter section.

the previous table lists the general details about what packets are processed. in more detail, the packets are passed through a series of filters to determine if a receive checksum is calculated:

7.1.10.1 mac address filter

this filter checks the mac destination address to be sure it is valid (for example, ia match, broadcast, multicast, etc.). the receive configuration settings determine which mac addresses are accepted. see the various receive control configuration registers such as rctl (rctl.upe, rctl.mpe, rctl.bam), mta, ral, and rah.

7.1.10.2 snap/vlan filter

this filter checks the next headers looking for an ip header. it is capable of decoding ethernet ii, ethernet snap, and ieee 802.3ac headers. it skips past any of these intermediate headers and looks for the ip header. the receive configuration settings determine which next headers are accepted. see the various receive control configuration registers such as rctl (rctl.vfe), vet, and vfta.
7.1.10.3 ipv4 filter

this filter checks for valid ipv4 headers. the version field is checked for a correct value (4). ipv4 headers are accepted if they are any size greater than or equal to 5 (dwords). if the ipv4 header is properly decoded, the ip checksum is checked for validity. the rxcsum.ipofld bit must be set for this filter to pass.

7.1.10.4 ipv6 filter

this filter checks for valid ipv6 headers, which are a fixed size and have no checksum. the ipv6 extension headers accepted are: hop-by-hop, destination options, and routing. the maximum size next header accepted is 16 dwords (64 bytes).

all of the ipv6 extension headers supported by the 82574 have the same header structure:

byte0       | byte1       | byte2 | byte3
next header | hdr ext len |       |

next header is a value that identifies the header type. the supported ipv6 next header values are:
• hop-by-hop = 0x00
• destination options = 0x3c
• routing = 0x2b

hdr ext len is the 8-byte count of the header length, not including the first 8 bytes. for example, a value of three means that the total header size including the next header and hdr ext len fields is 32 bytes (8 + 3*8).

the rfctl.ipv6_dis bit must be cleared for this filter to pass.

7.1.10.5 udp/tcp filter

this filter checks for a valid udp or tcp header. the protocol (next header) values are 0x11 and 0x06, respectively. the rxcsum.tuofld bit must be set for this filter to pass.

7.1.11 multiple receive queues and receive-side scaling (rss)

the 82574 provides two hardware receive queues and filters each receive packet into one of the queues based on criteria that are described as follows. classification of packets into receive queues has several uses, such as:
• receive side scaling (rss)
• generic multiple receive queues
• priority receive queues.
however, rss is the only usage that is described specifically. other uses should make use of the available hardware.

multiple receive queues are enabled when the rxcsum.pcsd bit is set (packet checksum is disabled) and the multiple receive queues enable bits are not 00b. multiple receive queues are therefore mutually exclusive with udp fragmentation, and are unsupported when using the legacy receive descriptor format; multiple receive queue status is not reported in the receive packet descriptor, and the interrupt mechanism bypasses the interrupt scheme described in section 7.1.11. instead, a receive packet is issued directly to the interrupt logic.

when multiple receive queues are enabled, the 82574 provides software with several types of information. some are requirements of microsoft* rss while others are provided for software device driver assistance:
• a dword result of the microsoft* rss hash function, to be used by the stack for flow classification, is written into the receive packet descriptor (required by microsoft* rss).
• a 4-bit rss type field conveys the hash function used for the specific packet (required by microsoft* rss).
• a mechanism to issue an interrupt to one or more cpus (section 7.1.11).

figure 29 shows the process of classifying a packet into a receive queue:
1. the receive packet is parsed into the header fields used by the hash operation (such as ip addresses, tcp port, etc.).
2. a hash calculation is performed. the 82574 supports a single hash function, as defined by microsoft* rss. the 82574 therefore does not indicate to the software device driver which hash function is used. the 32-bit result is fed into the packet receive descriptor.
3. the seven lsbs of the hash result are used as an index into a 128-entry redirection table. each entry in the table contains a 5-bit cpu number. this 5-bit value is fed into the packet receive descriptor. in addition, each entry provides a single-bit queue number, which denotes the queue into which the packet is routed.

when multiple request queues are disabled, packets enter hardware queue 0. system software might enable or disable rss at any time. while disabled, system software might update the contents of any of the rss-related registers.

when multiple request queues are enabled in rss mode, undecodable packets enter hardware queue 0. the 32-bit tag (normally a result of the hash function) equals zero. the 5-bit mrq field equals zero as well.
figure 29. rss block diagram
[figure: block diagram. the parsed receive packet feeds the rss hash. the 32-bit hash result goes to the packet descriptor, and its 7 ls bits index the redirection table (128 x 8), which supplies a 1-bit physical queue number. queue 0 is selected when mrq is disabled or (rss and not decodable).]

7.1.11.1 rss hash function

the 82574's hash function follows microsoft's* definition. a single hash function is defined with five variations for the following cases:
• tcpipv4 - the 82574 parses the packet to identify an ipv4 packet containing a tcp segment per the following criteria. if the packet is not an ipv4 packet containing a tcp segment, receive-side-scaling is not done for the packet.
• ipv4 - the 82574 parses the packet to identify an ipv4 packet. if the packet is not an ipv4 packet, receive-side-scaling is not done for the packet.
• tcpipv6 - the 82574 parses the packet to identify an ipv6 packet containing a tcp segment per the following criteria. if the packet is not an ipv6 packet containing a tcp segment, receive-side-scaling is not done for the packet. extension headers should be parsed for a home-address-option field (for source address) or the routing-header-type-2 field (for destination address).
• ipv6ex - the 82574 parses the packet to identify an ipv6 packet. extension headers should be parsed for a home-address-option field (for source address) or the routing-header-type-2 field (for destination address). note that the packet is not required to contain any of these extension headers to be hashed by this function. if the packet is not an ipv6 packet, receive-side-scaling is not done for the packet.
• ipv6 - the 82574 parses the packet to identify an ipv6 packet. if the packet is not an ipv6 packet, receive-side-scaling is not done for the packet.

two configuration bits impact the choice of the hash function as previously described:
• ipv6_exdis bit in receive filter control (rfctl) register: when set, if an ipv6 packet includes extension headers, then the tcpipv6 and ipv6ex functions are not used.
• new_ipv6_ext_dis bit in receive filter control (rfctl) register: when set, if an ipv6 packet includes either a home-address-option or a routing-header-type-2, then the tcpipv6 and ipv6ex functions are not used.

a packet is identified as containing a tcp segment if all of the following conditions are met:
• the transport layer protocol is tcp (not udp, icmp, igmp, etc.).
• the tcp segment can be parsed (such as ip parsed options, packet not encrypted).
• the packet is not fragmented (even if the fragment contains a complete tcp header).

bits[31:16] of the multiple receive queues command register enable each of the hash function variations (several can be set at a given time). if several functions are enabled at the same time, priority is defined as follows (skip functions that are not enabled):

ipv4 packet:
1. try using the tcpipv4 function. if the packet does not meet the requirements, try 2.
2. try using the ipv4 function.

ipv6 packet:
1. try using the tcpipv6 function. if the packet does not meet the requirements, try 2.
2. try using the ipv6ex function. if the packet does not meet the requirements, try 3.
3. try using the ipv6 function.
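the priority rules above amount to a first-match search over an ordered list. a minimal python sketch follows, assuming the parsing results are already available as booleans; `select_hash_function` and its parameters are hypothetical names, `is_tcp` stands for "all three tcp segment conditions are met", and the effect of the two rfctl bits is folded into the single flag `ext_headers_usable`.

```python
IPV4_ORDER = ("tcpipv4", "ipv4")
IPV6_ORDER = ("tcpipv6", "ipv6ex", "ipv6")


def select_hash_function(ip_version, enabled, is_tcp, ext_headers_usable=True):
    """Return the first enabled hash variation the packet qualifies for.

    ip_version:         4, 6, or anything else for non-ip packets
    enabled:            set of enabled variation names
    is_tcp:             True if the packet carries a parsable, unfragmented tcp segment
    ext_headers_usable: False if ipv6 extension headers rule out tcpipv6/ipv6ex
    Returns None when rss is not done for the packet.
    """
    if ip_version == 4:
        order = IPV4_ORDER
    elif ip_version == 6:
        order = IPV6_ORDER
    else:
        return None
    for name in order:
        if name not in enabled:
            continue  # skip functions that are not enabled
        if name.startswith("tcp") and not is_tcp:
            continue  # requirements not met, try the next one
        if name in ("tcpipv6", "ipv6ex") and not ext_headers_usable:
            continue
        return name
    return None


all_v6 = {"tcpipv6", "ipv6ex", "ipv6"}
choice = select_hash_function(6, all_v6, is_tcp=False)  # falls through to "ipv6ex"
```

a result of `None` corresponds to the case where the packet cannot be hashed and enters hardware queue 0.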
the following combinations are currently supported. other combinations might be supported in future products.

ipv4 hash types:
• s1a - tcpipv4 is enabled as defined above, or
• s1b - both tcpipv4 and ipv4 are enabled - the packet is first parsed according to tcpipv4 rules. if not identified as a tcpipv4 packet, it is then parsed as an ipv4 packet.
ipv6 hash types:
• s2a - tcpipv6 is enabled as defined above, or
• s2b - tcpipv6, ipv6ex, and ipv6 are enabled - the packet is first parsed according to tcpipv6 rules. if not identified as a tcpipv6 packet, it is then parsed as an ipv6ex packet. if the 82574 cannot parse extension headers (such as an unidentified extension in the packet), then the packet is parsed as ipv6.

when a packet cannot be parsed by the above rules, it enters hardware queue 0. the 32-bit tag (normally a result of the hash function) equals zero. the 5-bit mrq field equals zero as well.

the 32-bit result of the hash computation is written into the packet descriptor and also provides an index into the redirection table.

the following notation is used to describe the hash functions below:
• ordering is little endian in both bytes and bits. for example, the ip address 161.142.100.80 translates into 0xa18e6450 in the signature.
• a "^" denotes bit-wise xor operation of same-width vectors.
• @x-y denotes bytes x through y (including both of them) of the incoming packet, where byte 0 is the first byte of the ip header. in other words, we consider all byte-offsets as offsets into a packet where the framing layer header has been stripped out. therefore, the source ipv4 address is referred to as @12-15, while the destination v4 address is referred to as @16-19.
• @x-y, @v-w denotes concatenation of bytes x-y, followed by bytes v-w, preserving the order in which they occurred in the packet.

all hash function variations (ipv4 and ipv6) follow the same general structure. specific details for each variation are described in the following section.

the hash uses a random secret key of length 320 bits (40 bytes); the key is generated through the rss random key (rssrk) register.

the algorithm works by examining each bit of the hash input from left to right.
our nomenclature defines left and right for a byte-array as follows: given an array k of n bytes, our nomenclature assumes that the array is laid out as follows:

k[0] k[1] k[2] ... k[n-1]

k[0] is the left-most byte, and the msb of k[0] is the left-most bit. k[n-1] is the right-most byte, and the lsb of k[n-1] is the right-most bit.

computehash(input[], n)
for hash-input input[] of length n bytes (8n bits) and a random secret key k of 320 bits

    result = 0;
    for each bit b in input[] {
        if (b == 1) then result ^= (left-most 32 bits of k);
        shift k left 1 bit position;
    }
    return result;
the following four pseudo-code examples are intended to help clarify exactly how the hash is to be performed in four cases: ipv4 with and without the ability to parse the tcp header, and ipv6 with and without a tcp header.

7.1.11.1.1 hash for ipv4 with tcp

concatenate sourceaddress, destinationaddress, sourceport, destinationport into one single byte-array, preserving the order in which they occurred in the packet:

    input[12] = @12-15, @16-19, @20-21, @22-23
    result = computehash(input, 12)

7.1.11.1.2 hash for ipv4 without tcp

concatenate sourceaddress and destinationaddress into one single byte-array:

    input[8] = @12-15, @16-19
    result = computehash(input, 8)

7.1.11.1.3 hash for ipv6 with tcp

similar to above:

    input[36] = @8-23, @24-39, @40-41, @42-43
    result = computehash(input, 36)

7.1.11.1.4 hash for ipv6 without tcp

    input[32] = @8-23, @24-39
    result = computehash(input, 32)

7.1.11.2 redirection table

the redirection table is a 128-entry structure, indexed by the seven lsbs of the hash function output. each entry of the table contains the following:
• bit [7] - queue index
• bits [6:0] - reserved

the queue index determines the physical queue for the packet.

the contents of the redirection table are not defined following reset of the memory configuration registers. system software must initialize the table prior to enabling multiple receive queues. it might also update the redirection table during run time. such updates of the table are not synchronized with the arrival time of received packets. therefore, it is not guaranteed that a table update takes effect on a specific packet boundary.
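the computehash pseudo-code and the redirection table lookup translate fairly directly into python. the sketch below is an illustration under stated assumptions rather than a reference implementation: the function names are hypothetical, the key is the byte-stream listed in the rss verification suite (section 7.1.11.3), and addresses and ports are packed in network byte order so that the input bytes match the @x-y packet offsets.

```python
import ipaddress
import struct

# Key byte-stream from the rss verification suite (section 7.1.11.3).
RSS_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0d0ca2bcbae7b30b4"
    "77cb2da38030f20c6a42b73bbeac01fa"
)


def compute_hash(data, key=RSS_KEY):
    """For every set input bit, xor in the 32-bit window of the shifted key."""
    key_bits = len(key) * 8
    if len(data) * 8 > key_bits - 31:
        raise ValueError("input too long for this key")
    key_int = int.from_bytes(key, "big")  # k[0] is the left-most byte
    result = 0
    for bit_index in range(len(data) * 8):
        byte = data[bit_index // 8]
        if byte & (0x80 >> (bit_index % 8)):  # scan each byte msb first
            # The left-most 32 bits of the key after `bit_index` left shifts.
            shift = key_bits - 32 - bit_index
            result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def ipv4_tcp_input(src, dst, sport, dport):
    """input[12] = @12-15, @16-19, @20-21, @22-23."""
    return (
        ipaddress.IPv4Address(src).packed
        + ipaddress.IPv4Address(dst).packed
        + struct.pack("!HH", sport, dport)
    )


def queue_for_hash(hash_value, redirection_table):
    """Seven lsbs index the 128-entry table; bit 7 of the entry is the queue."""
    entry = redirection_table[hash_value & 0x7F]
    return (entry >> 7) & 1


# First ipv4 row of the verification suite: source 66.9.149.187:2794,
# destination 161.142.100.80:1766.
tcp_input = ipv4_tcp_input("66.9.149.187", "161.142.100.80", 2794, 1766)
print(hex(compute_hash(tcp_input)))      # ipv4 with tcp
print(hex(compute_hash(tcp_input[:8])))  # ipv4 only (addresses only)
```

the verification suite lists 0x51ccc178 (ipv4 with tcp) and 0x323e8fc2 (ipv4 only) for this tuple, which gives a direct way to check the sketch against the table. treating the key as one large integer avoids shifting a 40-byte array bit by bit; the result is the same window of key bits that the pseudo-code selects.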
7.1.11.3 rss verification suite

assume that the random key byte-stream is:

0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa

ipv4

destination address/port | source address/port  | ipv4 only  | ipv4 with tcp
161.142.100.80:1766      | 66.9.149.187:2794    | 0x323e8fc2 | 0x51ccc178
65.69.140.83:4739        | 199.92.111.2:14230   | 0xd718262a | 0xc626b0ea
12.22.207.184:38024      | 24.19.198.95:12898   | 0xd2d0a5de | 0x5c2b394a
209.142.163.6:2217       | 38.27.205.30:48228   | 0x82989176 | 0xafc7327f
202.188.127.2:1303       | 153.39.163.191:44251 | 0x5d1809c5 | 0x10e828a2

ipv6 (the ipv6 address tuples are only for verification purposes, and might not make sense as a tuple)

destination address/port         | source address/port                         | ipv6 only  | ipv6 with tcp
3ffe:2501:200:1fff::7 (1766)     | 3ffe:2501:200:3::1 (2794)                   | 0x2cc18cd5 | 0x40207d3d
ff02::1 (4739)                   | 3ffe:501:8::260:97ff:fe40:efab (14230)      | 0x0f0c461c | 0xdde51bbf
fe80::200:f8ff:fe21:67cf (38024) | 3ffe:1900:4545:3:200:f8ff:fe21:67cf (44251) | 0x4b61e985 | 0x02d1feef

7.2 packet transmission

7.2.1 transmit functionality

the 82574 transmit flow is a descriptor-based transmit where the hardware gets the per-packet details for the transmit tasks through descriptors created by software. this section outlines the transmit structures and process along with features and offloads supported by the 82574.
7.2.2 transmission flow using simplified legacy descriptors

step | description
1    | software defines a descriptor ring and configures the 82574's transmit queue with the address location, length, head, and tail pointers of the ring. this step is executed once per tx descriptor ring. see section 7.2.4 for more details on the descriptor ring structure.
2    | software prepares the packet headers and data for the transmit within one or more data buffers.
3    | software prepares tx descriptors according to the number of data buffers that are used. each descriptor points to a different data buffer and holds the required hardware processing. see section 7.2.10 for more details on the descriptor format. the software places the descriptors in the correct location in the tx descriptor ring.
4    | software updates the transmit descriptor tail pointer (tdt) to indicate to the hardware that tx descriptors are ready.
5    | hardware senses a change of the tdt and initiates a pcie request to fetch the descriptors from host memory.
6    | the descriptors' content is received in a pcie read completion and is written to the appropriate location in the descriptor queue.
7    | according to the descriptors' content, the corresponding memory data buffers are then fetched from the host to the hardware on-chip transmit fifo.
     | while the packet is passing through the dma and mac units, relevant offload functions are incorporated according to the commands in the descriptors.
10   | after the entire packet is fetched by the hardware it is transmitted to the ethernet link.
11   | after a dma of each buffer is complete, if the rs bit in the command byte is set, the dma updates the status field of the appropriate descriptor and writes back the descriptor to the descriptor ring in host memory.
12   | the hardware moves the transmit descriptor head pointer (tdh) in the direction of the tail to point to the next descriptor in the ring.
13   | after the entire packet is fetched by the hardware an interrupt might be generated by the hardware to notify the software device driver that it can release the relevant buffers to the operating system.

7.2.3 transmission process flow using extended descriptors

the 82574 supports extended tx descriptors that provide more offload capabilities. the extended offload capabilities are indicated to the hardware by two types of descriptors: context descriptors and data descriptors. the context descriptors define a set of offload capabilities applicable for multiple packets while the data descriptors define the data buffers and specific offload capabilities per packet.

the software/hardware flow while using the extended descriptors is as follows:
• software prepares the context descriptor that defines the offload capabilities for the incoming packets.
• software prepares the data packets in host memory within one or more data buffers and their descriptors.
• all steps are the same as the legacy tx descriptors as previously described (starting at step number 4) while the data buffers belong to a single packet.

the software/hardware flow for tcp segmentation using the extended descriptors is as follows:
• software prepares the context descriptor that defines the upcoming tcp segmentation. in this case, the data buffers belong to multiple packets.
• software places a prototype header in host memory and indicates it to the hardware by a data descriptor.
• software places the rest of the data to be transmitted in the host memory indicated to the hardware by additional data descriptors.
• hardware splits the data into multiple packets according to the maximum segment size (mss) defined in the context descriptor. hardware uses the prototype header for each packet while it auto-updates some of the fields in the ip and tcp headers. see more details in section 7.3.6.2.
• for each packet, the remaining steps are the same as the legacy tx descriptors as previously described (starting at step number 4).

7.2.4 transmit descriptor ring structure

the transmit descriptor ring is described by the following registers:
• transmit descriptor base address register (tdba)
  - this register indicates the start address of the descriptor ring buffer in the host memory; this 64-bit address is aligned on a 16-byte boundary and is stored in two consecutive 32-bit registers. hardware ignores the lower four bits.
• transmit descriptor length register (tdlen)
  - this register determines the number of bytes allocated to the circular ring. this value must be aligned to 128 bytes.
• transmit descriptor head register (tdh)
  - this register holds an index value that indicates the in-progress descriptor. there can be up to 64 k descriptors in the circular buffer. reading this register returns the value of head corresponding to descriptors already loaded in the transmit fifo.
• transmit descriptor tail register (tdt)
  - this register holds a value, which is an offset from the base (tdba), and indicates the location beyond the last descriptor hardware can process. this is the location where software writes the next new descriptor.

figure 30. transmit descriptor ring structure
[figure: circular ring running from base (tdba) to base + tdlen, with head (tdh) and tail (tdt) pointers]
descriptors between the head and the tail pointers are descriptors that have been prepared by software and are owned by hardware.

7.2.4.1 transmit descriptor fetching

the descriptor processing strategy for transmit descriptors is essentially the same as for receive descriptors.

when the on-chip descriptor queue is empty, a fetch occurs as soon as any descriptors are made available (host writes to the tail pointer). hardware might elect to perform a fetch which is not a multiple of cache line size. the hardware performs this non-aligned fetch if doing so results in the next descriptor fetch being aligned on a cache line boundary. this enables the descriptor fetch mechanism to be most efficient in the cases where it has fallen behind software.

after the initial fetch of descriptors, as the on-chip buffer empties, the hardware can decide to pre-fetch more transmit descriptors if the number of on-chip descriptors drops below txdctl.pthresh and enough valid descriptors txdct is performed.

note: the 82574 never fetches descriptors beyond the descriptor tail pointer.

7.2.4.2 transmit descriptor write back

the descriptor write-back policy for transmit descriptors is similar to that for receive descriptors with a few additional factors. there are three factors: the report status (rs) bit in the transmit descriptor, the write back threshold (txdctl.wthresh) and the interrupt delay enable (ide) bit in the transmit descriptor.

descriptors are written back in one of three cases:
• txdctl.wthresh = zero, ide = zero and a descriptor with rs set to 1b is ready to be written back. for this condition, write backs are immediate. the device writes back only the status byte of the descriptor (tdescr.sta) and all other bytes of the descriptor are left unchanged.
• ide = 1b and the transmit interrupt delay (tidv) register timer expires. this timer is used to force a timely write back of descriptors. timer expiration flushes any accumulated descriptors and sets an interrupt event.
• txdctl.wthresh > zero. the write back of the full descriptors is performed only when txdctl.wthresh number of descriptors are ready for a write back.
7.2.4.3 determining completed frames as done

software can determine if a packet has been sent by the following method:
• setting the rs bit in the transmit descriptor command field and checking the dd bit of the relevant descriptors in host memory.

the process of checking for completed descriptors consists of the following:
• the software device driver scans the host memory for the value of the dd status bit. a dd bit of 1b indicates a completed packet, and also indicates that all packets preceding that packet have been put in the output fifo.

7.2.5 multiple transmit queues

the 82574 supports two transmit descriptor rings. each ring functions as described in section 7.2.4. when software enables the two transmit queues, it must also enable the multiple request support in the tctl register. the priority and arbitration between the queues can be set and specified using the tarc registers in the memory space (see section 10.2.6.9). this feature is intended to enable support for quality of service (qos), supporting 802.1p, while classifying packets into different priority queues.

7.2.6 overview of on-chip transmit modes

transmit mode is used to refer to a set of configurations that support some of the transmit path offloads. these modes are updated and controlled with the transmit descriptors. there are three types of transmit modes:
• legacy mode
• extended mode
• segmentation mode

the first mode (legacy) is an implied mode as it is not explicitly specified with a context descriptor. this mode is constructed by the device from the first and last descriptors of a legacy transmit and from some internal constants. the legacy mode enables insertion of one checksum.

the other two modes are indicated explicitly by a transmit context descriptor. the extended mode is used to control the checksum offloading feature for packet transmission.
the segmentation mode is used to control the packet segmentation capabilities of the device. the tse bit in the context descriptor selects which mode is updated, that is, extended mode or segmentation mode. the extended and segmentation modes enable insertion of two checksums. in addition, the segmentation mode adds information specific to the segmentation capability.
the device automatically selects the appropriate mode to use based on the current packet transmission: legacy, extended, or segmentation.

note: while the architecture supports arbitrary ordering rules for the various descriptors, there are restrictions including:
• context descriptors should not occur in the middle of a packet or of a segmentation.
• data descriptors of different packet types (legacy, extended, or segmentation) should not be intermingled except at the packet (or segmentation) level.

there are dedicated resources on-chip for both the extended and segmentation modes. these modes remain constant until they are modified by another context descriptor. this means that a set of configurations relevant to one mode can (and will) be used for multiple packets unless a new mode is loaded prior to sending a new packet.

note: when working with two descriptor queues in the 82574, the software needs to rewrite the context descriptor for each packet as it can't know if the second queue transmission had modified the on-chip context or not. the hardware keeps track of only the last context descriptor that was written.

7.2.7 pipelined tx data read requests

transmit data request pipelining is the process by which a request for transmit data is sent to the host memory before the read dma request of the previously requested data completes. transmit request pipelining is enabled using the mulr bit in the transmit control (tctl) register; its initial value is loaded from the nvm.

the 82574 supports four pipelined requests from the tx data dma. in general, the four requests can belong to the same packet or to consecutive packets. however, the following restrictions apply:
• all requests for a packet are issued before a request is issued for a following packet.
• if a request (for the following packet) requires a context change, the request for the following packet is not issued until the previous request is completed (that is, no pipelining across contexts).

the pcie specification does not ensure that completions for separate requests return in order. the 82574 can handle completions that arrive in any order. the 82574 incorporates a 2 kb buffer to support re-ordering of completions for the four requests. each request/completion can be up to 512 bytes long. the maximum size of a read request is defined as follows:
• when the mulr bit is cleared, the maximum request size in bytes is min{2k, max_read_request_size}
• when the mulr bit is set, the maximum request size in bytes is min{512, max_read_request_size}

note: in addition to the four pipelined requests from the tx data dma, the 82574 can issue a single read request from each of the 2 tx descriptor and 2 rx descriptor dma engines. the requests from the three sources (tx data, tx descriptor and rx descriptor) are independently issued. each descriptor read request can fetch up to 16 descriptors (equal to 256 bytes of data).
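the read request size rule in section 7.2.7 reduces to a single minimum. the sketch below only restates that rule; the function name is hypothetical and max_read_request_size is assumed to be supplied in bytes by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest tx data read request in bytes:
 *   mulr cleared: min(2048, max_read_request_size)
 *   mulr set:     min(512,  max_read_request_size)
 * With mulr set, four outstanding 512-byte requests match the 2 kb
 * completion re-ordering buffer described in the text. */
static uint32_t tx_max_read_request(bool mulr, uint32_t max_read_request_size)
{
    uint32_t cap = mulr ? 512u : 2048u;
    return (max_read_request_size < cap) ? max_read_request_size : cap;
}
```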
7.2.8 transmit interrupts

hardware supplies the transmit interrupts described below. these interrupts are initiated via the following conditions:
• transmit descriptor ring empty (icr.txqe) - all descriptors have been processed. the head pointer is equal to the tail pointer.
• transmit descriptor write back (icr.txdw) - any write backs are performed, either with the rs bit set or when accumulated descriptors are written back after txdctl.wthresh descriptors have been completed and accumulated.
• transmit delayed interrupt (icr.txdw) - in conjunction with interrupt delay enable (ide), the txdw indication is delayed per the tidv and/or tadv registers. the interrupt is set when one of the transmit interrupt countdown timers expires. a transmit delayed interrupt is scheduled for a transmit descriptor with its rs bit set and the ide bit set. when a transmit delayed interrupt occurs, the txdw interrupt bit is set (just as when a transmit descriptor write-back interrupt occurs). this interrupt can be masked in the same manner as the txdw interrupt. this interrupt is used frequently by software that performs dynamic transmit chaining by adding packets one at a time to the transmit chain.
  note: the transmit delay interrupt is indicated with the same interrupt bit as the transmit write-back interrupt, txdw. the transmit delay interrupt is only delayed in time as previously discussed.
  note: in msi-x mode, the ide bit in the transmit descriptor should not be set.
• transmit descriptor ring low threshold hit (icr.txd_low) - set when the total number of transmit descriptors available hits the low threshold specified in the txdctl.lwthresh field in the transmit descriptor control register. for the purposes of this interrupt, the number of transmit descriptors available is the difference between the transmit descriptor tail and transmit descriptor head values, minus the number of transmit descriptors that have been pre-fetched. up to eight descriptors can be pre-fetched.

7.2.8.1 delayed transmit interrupts

this mechanism allows software the flexibility of delaying transmit interrupts in order to allow more time for new descriptors to be written to the memory ring and potentially prevent an interrupt when the device's head pointer catches the tail pointer. this feature is desirable, because a software device driver usually has no knowledge of when it is going to be asked to send another frame. for performance reasons, it is best to generate only one transmit interrupt after a burst of packets has been sent.

7.2.9 transmit data storage

data is stored in buffers pointed to by the descriptors. alignment of data is on an arbitrary byte boundary with the maximum size per descriptor limited only to the maximum allowed length size. a packet typically consists of two (or more) descriptors, one (or more) for the header and one (or more) for the actual data. some software implementations copy the header(s) and packet data into one buffer and use only one descriptor per transmitted packet.
7.2.10 transmit descriptor formats

the original descriptor is referred to as the legacy descriptor and is described in section 7.2.10.1. the two new descriptor types are collectively referred to as extended descriptors. one of the new descriptor types is quite similar to the legacy descriptor in that it points to a block of packet data. this descriptor type is called the extended data descriptor. the other new descriptor type is fundamentally different as it does not point to packet data. this descriptor type is called the context descriptor. it only contains control information, which is loaded into registers of the 82574, and affects the processing of future packets. the following paragraphs describe the three descriptor formats.

the new descriptor types are specified by setting the tdesc.dext bit to 1b. if this bit is set, the tdesc.dtyp field is examined to determine the descriptor type (extended data or context). figure 32 shows the context descriptor generic layout. figure 34 shows the data descriptor generic layout.

7.2.10.1 legacy transmit descriptor format

figure 31. legacy transmit descriptor format
first qword:  buffer address [63:0]
second qword: length [15:0], cso [23:16], cmd [31:24], sta [35:32], extcmd [39:36], css [47:40], vlan [63:48]
cmd (bits 7 to 0): ide, vle, dext, rsv, rs, ic, ifcs, eop
sta: rsv [3:1], dd [0]
extcmd: rsv [3:1], ts [0]
vlan: pri [15:13], cfi [12], vlan [11:0]

the legacy tx descriptor is defined by setting the dext bit in the command field to 0b. the legacy tx descriptor format is shown in figure 31.

7.2.10.1.1 buffer address

the buffer address (tdesc.buffer address) specifies the location (address) in main memory of the data to be fetched.
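the layout in figure 31 can be pictured as two 64-bit words. the sketch below packs the second qword from its fields; the structure, macro and function names are hypothetical, the bit positions are the ones listed for figure 31, and a little-endian host view of descriptor memory is assumed.

```c
#include <stdint.h>

/* Hypothetical in-memory view of a legacy transmit descriptor. */
struct legacy_tx_desc {
    uint64_t buffer_addr; /* first qword: buffer address [63:0] */
    uint64_t fields;      /* second qword: length, cso, cmd, sta, extcmd, css, vlan */
};

/* Command byte bits (cmd occupies bits 31:24 of the second qword). */
#define TXD_CMD_EOP  0x01u /* end of packet */
#define TXD_CMD_IFCS 0x02u /* insert fcs */
#define TXD_CMD_IC   0x04u /* insert checksum */
#define TXD_CMD_RS   0x08u /* report status */
#define TXD_CMD_DEXT 0x20u /* must stay 0b for a legacy descriptor */
#define TXD_CMD_VLE  0x40u /* vlan packet enable */
#define TXD_CMD_IDE  0x80u /* interrupt delay enable */

/* Build the second qword; sta and extcmd are left at zero so that a stale
 * dd bit is never handed back to hardware. */
static uint64_t legacy_tx_fields(uint16_t length, uint8_t cso, uint8_t cmd,
                                 uint8_t css, uint16_t vlan)
{
    return (uint64_t)length
         | ((uint64_t)cso << 16)
         | ((uint64_t)(cmd & (uint8_t)~TXD_CMD_DEXT) << 24)
         | ((uint64_t)css << 40)
         | ((uint64_t)vlan << 48);
}
```

masking dext out of cmd is a design choice of this sketch: it keeps a caller from accidentally turning a legacy descriptor into an extended one.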
7.2.10.1.2 length

length (tdesc.length) specifies the length in bytes to be fetched from the buffer address. the maximum length associated with any single legacy descriptor is 16288 bytes.

note: the maximum allowable packet size for transmits might change based on the value configured for the transmit fifo size written to the packet buffer allocation (pba) register. for any individual packet, the sum of the individual descriptors' lengths must be at least 80 bytes less than the allocated size of the transmit fifo.

7.2.10.1.3 checksum offset and checksum start - cso and css

the checksum start (tdesc.css) field indicates where to begin computing the checksum. css must be set in the first descriptor of a packet. the checksum offset (tdesc.cso) field indicates where to insert the tcp checksum, relative to the start of the packet. both cso and css are in units of bytes and must be within the range of data provided to the device in the descriptor. this means, for short packets that are padded by software, css and cso must be in the range of the unpadded data length, not the eventual padded length (64 bytes).

note: cso must be set in the last descriptor of the packet. only when eop is set does the hardware interpret the insert checksum (ic) and cso bits.

in the case of an 802.1q header, the offset values depend on the vlan insertion enable bit (ctrl.vme) and the vle bit. when ctrl.vme and the vle bit are not set (vlan tagging included in the packet buffers), the offset values should include the vlan tagging. when these bits are set (vlan tagging is taken from the packet descriptor), the offset values should exclude the vlan tagging.

note: although the 82574 can be programmed to calculate and insert the tcp checksum using the legacy descriptor format as previously described, it is recommended that software use the newer context descriptor format. this newer descriptor format enables hardware to calculate both the ip and tcp checksums for outgoing packets. see section 7.2.7 for more information about how the new descriptor format can be used to accomplish this task.

note: udp checksum calculation is not supported by the legacy descriptor.

note: as the cso field is eight bits wide, it limits the location of the checksum to 255 bytes from the beginning of the packet. software must compute an offsetting entry and store it in the position where the hardware computed checksum is to be inserted. this offset is needed to back out the bytes of the header that should not be included in the tcp checksum.

7.2.10.1.4 command byte - cmd

the cmd byte stores the applicable command and has the fields shown in table 36.

table 36. command byte fields
bit:    7    6    5     4    3    2    1     0
field:  ide  vle  dext  rsv  rs   ic   ifcs  eop
ide (bit 7) - interrupt delay enable
vle (bit 6) - vlan packet enable
dext (bit 5) - descriptor extension (0b for legacy mode)
rsv (bit 4) - reserved
rs (bit 3) - report status
ic (bit 2) - insert checksum
ifcs (bit 1) - insert fcs (crc)
eop (bit 0) - end of packet

ide activates a transmit interrupt delay timer. hardware loads a countdown register when it writes back a transmit descriptor that has rs and ide set. the value loaded comes from the idv field of the interrupt delay (tidv) register. when the count reaches zero, a transmit interrupt occurs if transmit descriptor write-back interrupts (txdw) are enabled. hardware always loads the transmit interrupt counter whenever it processes a descriptor with ide set even if it is already counting down due to a previous descriptor. if hardware encounters a descriptor that has rs set, but not ide, it generates an interrupt immediately after writing back the descriptor and clears the interrupt delay timer. setting the ide bit has no meaning without setting the rs bit.

note: although the transmit interrupt might be delayed, the descriptor write-back requested by setting the rs bit is performed without delay unless descriptor write-back bursting is enabled.

vle indicates that the packet is a vlan packet (that is, that the hardware should add the vlan ether type and an 802.1q vlan tag to the packet).

note: if the vle bit is set, the ctrl.vme bit should also be set to enable vlan tag insertion.

the dext bit identifies this descriptor as either a legacy or an extended descriptor type and must be set to 0b to indicate a legacy descriptor.

when the rs bit is set, hardware writes back the dd bit once the dma fetch completes.

note: descriptors with the null address (0), or zero length, transfer no data. if they have the rs bit in the command byte set, then the dd field in the status word is written when hardware processes them. hardware only sets the dd bit for descriptors with rs set.

note: the software can set the rs bit in each descriptor or, more likely, in specific descriptors such as the last descriptor of each packet.

table 37. vlan tag insertion decision table when vlan mode enabled (ctrl.vme=1b)
vle  action
0    send generic ethernet packet. ifcs controls insertion of fcs in normal ethernet packets.
1    send 802.1q packet; the ethernet type field comes from the vet register and the vlan data comes from the special field of the tx descriptor; hardware appends the fcs/crc - command should reflect this by setting ifcs to 1b.
when ic is set, hardware inserts a checksum value calculated from the css bit value to the cse bit value, or to the end of packet. the checksum value is inserted in the header at the cso bit value location. one or many descriptors can be used to form a packet. checksum calculations are for the entire packet starting at the byte indicated by the css field. a value of zero for css corresponds to the first byte in the packet. css must be set in the first descriptor for a packet. in addition, ic is ignored if cso or css are out of range. this occurs if ( ) or ( ).

when ifcs is set, hardware appends the mac fcs at the end of the packet. when cleared, software should calculate the fcs for proper crc check. the software must set ifcs in the following instances:
• transmission of short packets while padding is enabled by the tctl.psp bit
• checksum offload is enabled by the ic bit in the tdesc.cmd
• vlan header insertion enabled by the vle bit in the tdesc.cmd
• large send or tcp/ip checksum offload using context descriptor

eop stands for end-of-packet and, when set, indicates the last descriptor making up the packet.

note: vle, ifcs, cso, and ic are qualified by eop. in other words, hardware interprets these bits only when the eop bit is set.

7.2.10.1.5 extended command - extcmd

rsv (bit 3:1) - reserved
ts (bit 0) - time stamp

the ts bit indicates to the 82574 to put a time stamp on the packet designated by the descriptor.

7.2.10.1.6 status - sta

rsv (bit 3:1) - reserved
dd (bit 0) - descriptor done status

dd indicates that the descriptor is done and is written back after the descriptor has been processed (assuming the rs bit was set). the dd bit can be used as an indicator to the software that all descriptors, in the memory descriptor ring, up to and including the descriptor with the dd bit set are again available to the software.
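the use of dd to reclaim descriptors can be sketched as a scan over the descriptors that software has handed to hardware. this is an illustration only: the structure and function names are hypothetical, the dd position (bit 0 of sta, with sta at bits 35:32 of the second qword) follows the legacy descriptor layout, and it is assumed that software clears the status field whenever it prepares a descriptor so that no stale dd bit survives from an earlier pass around the ring. a real driver would also need the memory ordering guarantees of its platform, which are omitted here.

```c
#include <stdint.h>

#define TXD_STA_DD (1ull << 32) /* dd = bit 0 of sta, sta = bits 35:32 */

struct tx_desc {
    uint64_t buffer_addr;
    uint64_t fields;
};

/* Scan the descriptors in [clean, use) and return the new clean index.
 * A descriptor with dd set means that it and every descriptor before it
 * are available to software again, so the last dd found decides the result.
 * Descriptors without rs never get dd and are skipped over. */
static uint32_t tx_reclaim(const struct tx_desc *ring, uint32_t count,
                           uint32_t clean, uint32_t use)
{
    uint32_t new_clean = clean;
    uint32_t i = clean;

    while (i != use) {
        uint32_t next = (i + 1u == count) ? 0u : i + 1u;
        if (ring[i].fields & TXD_STA_DD)
            new_clean = next;
        i = next;
    }
    return new_clean;
}
```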
7.2.10.1.7 vlan field

the vlan field is used to provide the 802.1q/802.1ac tagging information. the vlan field is ignored if the vle bit is 0b or if the eop bit is 0b.

vlan field layout: pri [15:13], cfi [12], vlan tag [11:0]

7.2.10.2 context transmit descriptor format

figure 32. context transmit descriptor format
first qword:  ipcss [7:0], ipcso [15:8], ipcse [31:16], tucss [39:32], tucso [47:40], tucse [63:48]
second qword: paylen [19:0], dtyp [23:20], tucmd [31:24], sta [35:32], rsv [39:36], hdrlen [47:40], mss [63:48]
tucmd (bits 7 to 0): ide, snap, dext, rsv, rs, tse, ip, tcp
sta: rsv [3:1], dd [0]
tucse, tucss, tucso are tcp/udp checksum controls; ipcse, ipcss, ipcso are ip checksum controls. dext must be 1 and dtyp must be 0000 for the context descriptor format.

the context descriptor provides access to the enhanced checksum offload and tcp segmentation features available in the 82574. a context descriptor differs from a data descriptor as it does not point to packet data. instead, this descriptor provides access to on-chip contexts that support the transmit checksum offloading or the segmentation feature of the 82574. a context refers to a set of parameters loaded or unloaded as a group to provide a particular function.

to select this descriptor format, the dext bit in the command field should be set to 1b and tdesc.dtyp should be set to 0x0000. in this case, the descriptor format is defined as shown in figure 32.

7.2.10.2.1 ip and tcp/udp checksum control

the first qword of this descriptor type contains parameters used to calculate the two checksums, which can be offloaded.
ipcss - ip checksum start - specifies the byte offset from the start of the dma'd data to the first byte to be included in the checksum. setting this value to 0b means the first byte of the data would be included in the checksum. this field is limited to the first 256 bytes of the packet and must be less than or equal to the total length of a given packet. if this is not the case, the results are unpredictable.

ipcso - ip checksum offset - specifies where the resulting checksum should be placed. this field is limited to the first 256 bytes of the packet and must be less than or equal to the total length of a given packet. if this is not the case, the checksum is not inserted.

ipcse - ip checksum end - specifies where the checksum should stop. a 16-bit value supports checksum offloading of packets as large as 64 kb. setting the ipcse field to all zeros means eop. in this way, the length of the packet does not need to be calculated.

note: when doing checksum or tcp segmentation with ipv6 headers, the ipcse field should be set to 0x0000, ipcss should be valid as in an ipv4 packet and the ixsm bit in the data descriptor should be cleared.

note: for proper ip checksum calculation, the ip header checksum field should be set to zero unless some adjustment is needed by the driver.

similarly, tucss, tucso, tucse specify the same parameters for the tcp or udp checksum.

note: for proper tcp/udp checksum calculation, the tcp/udp checksum field should be set to the partial pseudo-header checksum value.

in case of an 802.1q header, the offset values depend on the vlan insertion enable bit - ctrl.vme. when ctrl.vme is not set (vlan tagging included in the packet buffers), the offset values should include the vlan tagging. when ctrl.vme is set (vlan tagging is taken from the packet descriptor), the offset values should exclude the vlan tagging.
note: when setting the tcp segmentation context, ipcss and tucss are used to indicate the start of the ip and tcp headers respectively, and must be set even if checksum insertion is not desired.

in certain situations, software might need to calculate a partial checksum (the tcp pseudo-header for instance) to include bytes that are not contained within the range of start and end. if this is the case, this partial checksum should be placed in the packet data buffer, at the appropriate offset for the checksum. if no partial checksum is required, software must write a value of zero at this offset.

7.2.10.3 max segment size - mss

mss controls the maximum segment size. this specifies the maximum tcp or udp payload segment sent per frame, not including any header. the total length of each frame (or section) sent by the tcp segmentation mechanism (excluding 802.3ac tagging and ethernet crc) is mss bytes + hdrlen. the one exception is the last packet of a tcp segmentation that might be shorter. this field is ignored if tdesc.tse is not set.
7.2.10.3.1 header length - hdrlen

hdrlen is used to specify the length (in bytes) of the header to be used for each frame of a tcp segmentation operation. the first hdrlen bytes fetched from data descriptor(s) are stored internally and are used as a prototype header. the prototype header is updated for each packet and is prepended to the packet payload. for udp packets this will normally be equal to udp checksum offset + 2. for tcp messages it will normally be equal to tcp checksum offset + 4 + tcp header option bytes. this field is ignored if tdesc.tse is not set.

maximum limits for the hdrlen and mss fields are dictated by the lengths variables. however, there is a further restriction that for any tcp segmentation operation, the hardware must be capable of storing a complete framed fragment (completely-built frames) in the transmit fifo prior to transmission. therefore, the output tx fifo (packet buffer) should at least have (mss + hdrlen) space available. in addition, mss must be set to a value more than 0x10 and hdrlen must be smaller than 256 bytes.

7.2.10.4 payload - paylen

the packet length field (paylen) is the total number of payload bytes for this tcp segmentation offload (that is, the total number of payload bytes includes those that are distributed across multiple frames after tcp segmentation is performed). following the fetch of the prototype header, paylen specifies the length of data that is fetched next from data descriptor(s). this field is also used to determine when last-frame processing needs to be performed. the paylen specification does not include any header bytes. this field is ignored if tdesc.tse is not set.

note: there is no restriction on the overall paylen specification with respect to the transmit fifo size, once the mss and hdrlen specifications are legal.
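the relationship between mss, hdrlen and paylen can be checked with a little arithmetic. the helpers below are a sketch with hypothetical names: they restate the limits given above (mss more than 0x10, hdrlen smaller than 256 bytes, mss + hdrlen fitting in the transmit fifo) and derive the number of frames and the length of the last frame, excluding 802.3ac tagging and ethernet crc. the transmit fifo size is assumed to be passed in bytes by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when mss and hdrlen satisfy the limits stated in the text. */
static bool tso_params_valid(uint32_t mss, uint32_t hdrlen,
                             uint32_t tx_fifo_bytes)
{
    return mss > 0x10u && hdrlen < 256u && (mss + hdrlen) <= tx_fifo_bytes;
}

/* Number of frames a segmentation of paylen payload bytes produces
 * (0 when mss is 0, to avoid dividing by zero). */
static uint32_t tso_frame_count(uint32_t paylen, uint32_t mss)
{
    if (mss == 0u)
        return 0u;
    return paylen / mss + ((paylen % mss) ? 1u : 0u);
}

/* Length of the last frame: hdrlen plus whatever payload remains.
 * Every earlier frame is exactly mss + hdrlen bytes long.
 * Returns 0 when there is no payload or mss is 0. */
static uint32_t tso_last_frame_len(uint32_t paylen, uint32_t mss,
                                   uint32_t hdrlen)
{
    if (mss == 0u || paylen == 0u)
        return 0u;
    uint32_t rem = paylen % mss;
    return hdrlen + (rem ? rem : mss);
}
```

for example, 64000 payload bytes with an mss of 1460 and a 54-byte header give 43 full frames of 1514 bytes and one final frame of 1274 bytes.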
7.2.10.5 descriptor type - dtyp

setting the descriptor type (tdesc.dtyp) field to 0x0000 identifies this descriptor as a context descriptor.

7.2.10.6 command - tucmd

the command field (tdesc.tucmd) provides options that control the checksum offloading and tcp segmentation features, along with some of the generic descriptor processing functions. table 38 lists the bit definitions for the tdesc.tucmd field. the ide, dext, and rs bits are valid regardless of the state of tse. all other bits are ignored if tse=0b.

table 38. command tucmd fields
bit:    7    6     5     4    3    2    1    0
field:  ide  snap  dext  rsv  rs   tse  ip   tcp
ide (bit 7) - interrupt delay enable
snap (bit 6) - snap
dext (bit 5) - descriptor extension (must be 1b for this descriptor type)
rsv (bit 4) - reserved
rs (bit 3) - report status
tse (bit 2) - tcp segmentation enable
ip (bit 1) - ip packet type (ipv4=1b, ipv6=0b)
tcp (bit 0) - packet type (tcp=1b, udp=0b)

ide activates a transmit interrupt delay timer. hardware loads a countdown register when it writes back a transmit descriptor that has rs and ide set. the value loaded comes from the idv field of the interrupt delay (tidv) register. when the count reaches zero, a transmit interrupt occurs if transmit descriptor write-back interrupts (txdw) are enabled. hardware always loads the transmit interrupt counter whenever it processes a descriptor with ide set even if it is already counting down due to a previous descriptor. if hardware encounters a descriptor that has rs set, but not ide, it generates an interrupt immediately after writing back the descriptor and clears the interrupt delay timer. setting the ide bit has no meaning without setting the rs bit.

note: although the transmit interrupt may be delayed, the descriptor write-back requested by setting the rs bit is performed without delay unless descriptor write-back bursting is enabled.

snap indicates that the tcp segmentation mac header includes a snap header that needs to be updated by hardware.

the dext bit identifies this descriptor as one of the extended descriptor types and must be set to 1b.

when the rs bit is set, hardware writes back the dd bit once the dma fetch completes.

note: descriptors with the null address (0), or zero length, transfer no data. if they have the rs bit in the command byte set, then the dd field in the status word is written when hardware processes them. hardware only sets the dd bit for descriptors with rs set.
note: software can set the rs bit in each descriptor or, more likely, in specific descriptors such as the last descriptor of each packet.

tse indicates that this descriptor is setting the tcp segmentation context. if this bit is zero, the descriptor defines a single packet tcp/udp, ip checksum offload mode. when a descriptor of this type is processed, the device immediately updates the mode in question (tcp segmentation or checksum offloading) with values from the descriptor. this means that if any normal packets or tcp segmentation packets are in progress (a descriptor with eop set has not been received for the given context) the results will likely be undesirable.

the ip bit is used to indicate what type of ip (ipv4 or ipv6) packet is used in the segmentation process. this is necessary for the 82574 to know where the ip payload length field is located. this does not override the checksum insertion bit, ixsm. the ip bit must only be set for ipv4 packets and cleared for ipv6 packets.
the tcp bit identifies the packet as either tcp or udp (non-tcp). this affects the processing of the header information.

7.2.10.7 status - sta

four bits are reserved to provide transmit status, although only one is currently assigned for this specific descriptor type. the status word will only be written back to host memory in cases where the rs bit is set in the command.

dd indicates that the descriptor is done and is written back after the descriptor has been processed only if the rs bit was set.

figure 33. transmit status layout
rsv (bits 3-1) - reserved
dd (bit 0) - descriptor done

7.2.11 extended data descriptor format

figure 34. extended data descriptor format
first qword:  address [63:0]
second qword: dtalen [19:0], dtyp [23:20], dcmd [31:24], sta [35:32], extcmd [39:36], popts [47:40], vlan [63:48]
dcmd (bits 7 to 0): ide, vle, dext, rsv, rs, tse, ifcs, eop
sta: rsv [3:1], dd [0]
extcmd: rsv [3:1], ts [0]
popts: rsv [7:2], txsm [1], ixsm [0]
vlan: pri [15:13], cfi [12], vlan id [11:0]

the extended data descriptor is the companion to the context descriptor described in the previous section. this descriptor type points to the location of the data in the host memory.

to select this descriptor format, bit 29 (tdesc.dext) must be set to 1b and tdesc.dtyp must be set to 0x0001. in this case, the descriptor format is defined as shown in figure 34.

the first qword of this descriptor type contains the address of a data buffer in host memory. this buffer contains all or a portion of a transmit packet.

the second qword of this descriptor contains information about the data pointed to by this descriptor as well as descriptor processing options.
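the second qword of an extended data descriptor can be assembled from its fields as in the sketch below. the macro and function names are hypothetical; the bit positions are those of figure 34, with dcmd at bits 31:24 so that dext (bit 5 of dcmd) lands on bit 29 as the text states. sta and extcmd are left at zero in this sketch.

```c
#include <stdint.h>

/* dcmd bits (dcmd occupies bits 31:24 of the second qword). */
#define DCMD_EOP  0x01u /* end of packet */
#define DCMD_IFCS 0x02u /* insert fcs */
#define DCMD_TSE  0x04u /* tcp segmentation enable */
#define DCMD_RS   0x08u /* report status */
#define DCMD_DEXT 0x20u /* descriptor extension, must be 1b here */
#define DCMD_VLE  0x40u /* vlan enable */
#define DCMD_IDE  0x80u /* interrupt delay enable */

/* popts bits (popts occupies bits 47:40 of the second qword). */
#define POPTS_IXSM 0x01u /* insert ip checksum */
#define POPTS_TXSM 0x02u /* insert tcp/udp checksum */

#define DTYP_DATA 0x1u   /* dtyp = 0001b identifies an extended data descriptor */

/* Build the second qword of an extended data descriptor. dext is forced
 * to 1b and dtalen is truncated to its 20-bit field. */
static uint64_t ext_data_fields(uint32_t dtalen, uint8_t dcmd,
                                uint8_t popts, uint16_t vlan)
{
    return (uint64_t)(dtalen & 0xFFFFFu)
         | ((uint64_t)DTYP_DATA << 20)
         | ((uint64_t)(dcmd | DCMD_DEXT) << 24)
         | ((uint64_t)popts << 40)
         | ((uint64_t)vlan << 48);
}
```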
7.2.11.1 data length - dtalen

the data length field (tdesc.dtalen) is the total length of the data pointed to by this descriptor (the entire send), in bytes. for data descriptors not associated with a tcp segmentation operation (tdesc.tse not set), the descriptor lengths are subject to the same restrictions specified for legacy descriptors (the sum of the lengths of the data descriptors comprising a single packet must be at least 80 bytes less than the allocated size of the transmit fifo).

7.2.11.2 descriptor type - dtyp

setting the descriptor type (tdesc.dtyp) field to 0x0001 identifies this descriptor as an extended data descriptor.

7.2.11.3 command - dcmd

the command field (tdesc.dcmd) provides options that control the checksum offloading and tcp segmentation features, along with some of the generic descriptor processing features. table 39 lists the bit definitions for the dcmd field.

table 39. command dcmd fields
bit:    7    6    5     4    3    2    1     0
field:  ide  vle  dext  rsv  rs   tse  ifcs  eop

ide (bit 7) - interrupt delay enable
vle (bit 6) - vlan enable
dext (bit 5) - descriptor extension (must be 1b for this descriptor type)
rsv (bit 4) - reserved
rs (bit 3) - report status
tse (bit 2) - tcp segmentation enable
ifcs (bit 1) - insert fcs (also controls insertion of ethernet crc)
eop (bit 0) - end of packet

ide activates a transmit interrupt delay timer. hardware loads a countdown register when it writes back a transmit descriptor that has rs and ide set. the value loaded comes from the idv field of the interrupt delay (tidv) register. when the count reaches zero, a transmit interrupt occurs if transmit descriptor write-back interrupts (txdw) are enabled. hardware always loads the transmit interrupt counter whenever it processes a descriptor with ide set even if it is already counting down due to a previous descriptor. if hardware encounters a descriptor that has rs set, but not ide, it generates an interrupt immediately after writing back the descriptor and clears the interrupt delay timer. setting the ide bit has no meaning without setting the rs bit.
although the transmit interrupt might be delayed, the descriptor write-back requested by setting the rs bit is performed without delay unless descriptor write-back bursting is enabled.

vle indicates that the packet is a vlan packet (for example, that the hardware should add the vlan ether type and an 802.1q vlan tag to the packet).

note: if the vle bit is set to enable vlan tag insertion, the ctrl.vme bit should also be set.

the dext bit identifies this descriptor as one of the extended descriptor types and must be set to 1b.

when the rs bit is set, the hardware writes back the dd bit once the dma fetch completes.

note: descriptors with the null address (0), or zero length, transfer no data. if they have the rs bit in the command byte set, then the dd field in the status word is written when hardware processes them. hardware only sets the dd bit for descriptors with rs set. software can set the rs bit in each descriptor or, more likely, in specific descriptors such as the last descriptor of each packet.

tse indicates that this descriptor is part of the current tcp segmentation context. if this bit is not set, the descriptor is part of the normal non-segmentation context.

ifcs controls insertion of the ethernet crc. the packet fcs covers the tcp/ip headers. therefore, when using the tcp segmentation offload, software must also use the fcs insertion.

note: the vle, ifcs, and vlan fields are only valid in certain descriptors. if tse is enabled, the vle, ifcs, and vlan fields are only valid in the first data descriptor of the tcp segmentation context. if tse is not enabled, then these fields are only valid in the last descriptor of the given packet (qualified by the eop bit).

eop, when set, indicates the last descriptor making up the packet.

table 40. vlan tag insertion decision table

vle  action
0    send generic ethernet packet. ifcs controls insertion of fcs in normal ethernet packets.
1    send 802.1q packet; the ethernet type field comes from the vet register and the vlan data comes from the special field of the tx descriptor; hardware always appends the fcs/crc.
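as a rough illustration of the command byte rules above, the following python sketch composes the command byte for a single-descriptor packet. the bit positions follow the 7..0 layout of the command byte; the `Dcmd` class, the `build_dcmd` helper, and its parameter names are hypothetical names introduced for this example and are not part of any driver api.

```python
from enum import IntFlag


class Dcmd(IntFlag):
    """command byte bits, numbered as in the 7..0 layout of the command byte."""
    EOP = 0x01   # bit 0: last descriptor making up the packet
    IFCS = 0x02  # bit 1: insert the ethernet crc
    TSE = 0x04   # bit 2: descriptor belongs to a tcp segmentation context
    RS = 0x08    # bit 3: request dd write-back
    RSV = 0x10   # bit 4: reserved
    DEXT = 0x20  # bit 5: extended descriptor type, must be 1b
    VLE = 0x40   # bit 6: vlan tag insertion
    IDE = 0x80   # bit 7: only meaningful together with rs


def build_dcmd(*, eop, rs, ide=False, vle=False, tse=False, ifcs=True):
    """return the command byte as an int (hypothetical helper)."""
    if ide and not rs:
        raise ValueError("ide has no meaning without rs")
    if tse and not ifcs:
        raise ValueError("tcp segmentation offload also requires fcs insertion")
    value = Dcmd.DEXT  # dext must always be set for extended descriptors
    for flag, enabled in (
        (Dcmd.EOP, eop), (Dcmd.IFCS, ifcs), (Dcmd.TSE, tse),
        (Dcmd.RS, rs), (Dcmd.VLE, vle), (Dcmd.IDE, ide),
    ):
        if enabled:
            value |= flag
    return int(value)


# single-descriptor packet with status write-back and fcs insertion
print(hex(build_dcmd(eop=True, rs=True)))  # 0x2b
```

setting vle in the command byte is not sufficient on its own: ctrl.vme should also be set, which this sketch does not model.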
7.2.11.4 status - sta

the status field is written back to host memory in cases where the rs bit is set in the command field. the dd bit indicates that the descriptor is done after the descriptor has been processed.

bit:    3:1  0
field:  rsv  dd

rsv (bits 3:1) - reserved
dd (bit 0) - descriptor done

7.2.11.5 extended command

the extended command field (tdesc.extcmd) provides additional control options. table 41 lists the bit definitions for the extcmd field.

table 41. transmit extended command (tdesc.extcmd) layout

bit:    3:1       0
field:  reserved  timestamp

timestamp (bit 0) - indication to stamp the transmitted packet time for timesync.

7.2.11.6 packet options - popts

the popts field provides a number of options, which control the handling of this packet. this field is relevant only on the first data descriptor of a packet or segmentation context.

bit:    7:2  1     0
field:  rsv  txsm  ixsm

rsv (bits 7:2) - reserved
txsm (bit 1) - insert tcp/udp checksum
ixsm (bit 0) - insert ip checksum

ixsm and txsm are used to control insertion of the ip and tcp/udp checksums, respectively. if the corresponding bit is not set, whatever value software has placed into the checksum field of the packet data is placed on the wire.

note: for proper values of the ip and tcp checksum, software must set the ixsm and txsm when using the transmit segmentation.

note: software should not set this field for ipv6 packets.
7.2.11.7 vlan

the vlan field is used to provide the 802.1q tagging information. the special field is ignored if the vle bit in the dcmd command byte is 0b.

7.3 tcp segmentation

tcp segmentation is an offloading option of the tcp/ip stack. this is often referred to as transmit segmentation offloading (tso). with this feature, the software device driver and hardware are responsible for carving up tcp messages larger than the maximum transmission unit (mtu) of the medium into mss sized frames that have appropriate layer 2, 3 (ip), and 4 (tcp) headers. these headers must have the correct sequence number, ip identification, checksum fields, options and flag values as required. this is done by breaking up the data into segments smaller than or equal to the mss.

note: some of these values (such as the checksum values) are unique for each packet of the tcp message, and other fields such as the source ip address are constant for all frames associated with the tcp message.

the offloading of these mechanisms to the software device driver and the 82574 saves significant cpu cycles. the software device driver shares the additional tasks to support these options with the 82574.

7.3.1 tcp segmentation performance advantages

performance advantages for a hardware implementation of tcp segmentation offload include:
• the stack does not need to partition the block to fit the mtu size, saving cpu cycles.
• the stack only computes one ethernet, ip, and tcp header per segment (entire packet), saving cpu cycles.
• the stack interfaces with the software device driver only once per block transfer, instead of once per frame.
• interrupts are easily reduced to once per tcp message instead of once per frame.
• fewer i/o accesses are required to command the 82574.

note: tcp segmentation requires the transmit context descriptor format and the transmit data descriptor format.
7.3.2 ethernet packet format

a tcp message can be fragmented across multiple pages in host memory. the 82574 partitions the data packet into standard ethernet frames prior to transmission. the 82574 supports calculating the ethernet, ip, tcp, and udp headers, including checksum, on a frame-by-frame basis.

[vlan field layout, belonging to section 7.2.11.7]
bit:    15:13  12   11:0
field:  pri    cfi  vlan id
[figure 35. tcp/ip packet format: ethernet (l2), ip (l3), tcp (l4), data, fcs]

frame formats supported by the 82574 include:
• ethernet 802.3
• ieee 802.1q vlan (ethernet 802.3ac)
• ethernet type 2
• ethernet snap
• ipv4 headers with options
• ipv6 headers with ip option next headers
• tcp with options
• udp with options

vlan tag insertion is handled by hardware.

note: ip tunneled packets are not supported for tso operation.

once the tcp segmentation context has been set, the next descriptor provides the initial data to transfer. this first descriptor (or descriptors) must point to a packet of the type indicated. furthermore, the data it points to might need to be modified by software as it serves as the prototype (partial pseudo-header) header for all packets within the tcp segmentation context. the following sections describe the supported packet types and the various updates which are performed by hardware. this should be used as a guide to determine what must be modified in the original packet header to make it a suitable prototype (partial pseudo-header) header.

7.3.3 tcp segmentation data descriptors

the tcp segmentation data descriptor is the companion to the tcp segmentation context descriptor described in the previous section. for a complete description of the descriptor, refer to section 7.2.11. to select this descriptor format, bit 29 (tdesc.dext) must be set to 1b and tdesc.dtyp must be set to 0x0001.
7.3.4 tcp segmentation source data

once the tcp segmentation context has been set, the next descriptor (data descriptor) provides the initial data to transfer. this first data descriptor must point to data containing an ethernet header of the type indicated. the 82574 fetches the prototype (partial pseudo-header) header from the host data buffer into an internal buffer and this header is prepended to every packet for this tso operation. the prototype (partial pseudo-header) header is modified accordingly for each mss sized segment.

the following sections describe the supported packet types and the various updates that are performed by hardware. this should be used as a guide to determine what must be modified in the original packet header to make it a suitable prototype (partial pseudo-header) header.

the following summarizes the fields considered by the driver for modification in constructing the prototype (partial pseudo-header) header.

mac header (for snap)
• mac header len field should be set to 0b.

ipv4 header
• length should be set to zero.
• identification field should be set as appropriate for first packet of send (if not already).
• header checksum should be zeroed out unless some adjustment is needed by the software device driver.

ipv6 header
• length should be set to zero.

tcp header
• sequence number should be set as appropriate for first packet of send (if not already).
• psh and fin flags should be set as appropriate for last packet of send.
• tcp checksum should be set to the partial pseudo-header checksum.

udp header
• udp checksum should be set to the partial pseudo-header checksum.

the 82574's dma function fetches the ip and tcp/udp prototype (partial pseudo-header) header information from the initial descriptor(s) and saves it on-chip for individual packet header generation.
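the ipv4 part of the list above can be sketched in python as follows. this is a minimal sketch, not a driver routine: `make_ipv4_prototype` and its `ip_offset` parameter are hypothetical names, the field offsets are the standard ipv4 header offsets (total length at bytes 2..3, header checksum at bytes 10..11), and the identification field, the tcp fields, and the partial pseudo-header checksums are deliberately left untouched.

```python
def make_ipv4_prototype(frame: bytes, ip_offset: int = 14) -> bytes:
    """return a copy of `frame` with the ipv4 length and checksum zeroed.

    `ip_offset` is an assumption of this sketch: the byte offset of the
    ipv4 header, 14 for an untagged ethernet type 2 frame.
    """
    proto = bytearray(frame)
    if proto[ip_offset] >> 4 != 4:
        raise ValueError("no ipv4 header at ip_offset")
    proto[ip_offset + 2:ip_offset + 4] = b"\x00\x00"    # total length
    proto[ip_offset + 10:ip_offset + 12] = b"\x00\x00"  # header checksum
    return bytes(proto)


# 14-byte ethernet header followed by a 20-byte ipv4 header (made-up values)
eth = bytes(12) + b"\x08\x00"
ip = bytes.fromhex("450005dc1c4640004006b1e6c0a80001c0a800c7")
prototype = make_ipv4_prototype(eth + ip)
```

working on a copy keeps the original packet intact, which can be convenient when the same buffer is also handed to a non-offloaded path.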
7.3.5 hardware performed updating for each frame

the following sections describe the updating process performed by the hardware for each frame sent using the tcp segmentation capability.
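as orientation for the per-frame updates listed in section 7.3.6.2, the following python sketch walks through how a tcp message of paylen bytes splits into frames and which ipv4 total length and tcp sequence number each frame would carry. it only restates the formulas of that section (mss + hdrlen - ipcss, and last frame payload = paylen - n * mss); the `Frame` type, the `tso_frames` function, and the `first_seq` parameter are hypothetical names, and the 32-bit wrap of the sequence number is standard tcp behavior rather than something stated in this section.

```python
from typing import Iterator, NamedTuple


class Frame(NamedTuple):
    payload: int          # tcp payload bytes carried by this frame
    ip_total_length: int  # payload + hdrlen - ipcss
    seq: int              # tcp sequence number of the first payload byte
    last: bool            # fin/psh from the prototype are only kept here


def tso_frames(paylen: int, mss: int, hdrlen: int, ipcss: int,
               first_seq: int = 0) -> Iterator[Frame]:
    """yield one Frame per segment of a tcp segmentation context."""
    if paylen <= 0 or mss <= 0:
        raise ValueError("paylen and mss must be positive")
    sent = 0
    while sent < paylen:
        payload = min(mss, paylen - sent)  # last frame: paylen - (n * mss)
        last = sent + payload == paylen
        yield Frame(payload, payload + hdrlen - ipcss,
                    (first_seq + sent) % 2**32, last)
        sent += payload


# 4000 payload bytes, mss 1460, 54 header bytes, ip header at offset 14
frames = list(tso_frames(4000, 1460, 54, 14, first_seq=1000))
```

with these example values the message is carried in three frames, the last one shorter than the mss.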
7.3.6 tcp segmentation use of multiple data descriptors

tcp segmentation enables a series of data descriptors, each referencing a single physical address page, to reference a large packet contained in a single virtual-address buffer.

the only requirement on use of multiple data descriptors for tcp segmentation is as follows:
• if multiple data descriptors are used to describe the ip/tcp/udp header section, each descriptor must describe one or more complete headers; descriptors referencing only parts of headers are not supported.

note: it is recommended that the entire header section, as described by the tcp context descriptor hdrlen field, be coalesced into a single buffer and described using a single data descriptor. if all the layer headers (l2-l4) are not coalesced into a single buffer, each buffer must not cross a 4 kb boundary, or be bigger than max_read_request.

7.3.6.1 transmit checksum offloading with tcp segmentation

the 82574 supports checksum offloading as a component of the tcp segmentation offload feature and as a standalone capability. the 82574 supports ip and tcp/udp header options in the checksum computation for packets that are derived from the tcp segmentation feature.

note: the 82574 is capable of computing one level of ip header checksum and one tcp/udp header and payload checksum. in case of multiple ip headers, the software device driver has to compute all but one ip header checksum. the 82574 calculates checksums on the fly on a frame-by-frame basis and inserts the result in the ip/tcp/udp headers of each frame. tcp and udp checksum are a result of performing the checksum on all bytes of the payload and the pseudo header.

three specific types of checksum are supported by the hardware in the context of the tcp segmentation offload feature:
• ipv4 checksum (ipv6 does not have a checksum)
• tcp checksum
• udp checksum

each packet that is sent via the tcp segmentation offload feature optionally includes the ipv4 checksum and either the tcp or udp checksum.

all checksum calculations use a 16-bit wide ones complement checksum. the checksum word is calculated on the outgoing data. the checksum field is written with the 16-bit ones complement sum of all 16-bit words in the range of css to cse, including the checksum field itself.
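the checksum arithmetic described above can be sketched in python as follows. this is a software illustration, not a description of the hardware data path. the `cso` parameter (the byte offset at which the result is stored) and both function names are assumptions of this sketch, and the final inversion of the sum follows common internet checksum practice (rfc 1071) rather than anything stated in this section.

```python
def ones_complement_sum16(data: bytes) -> int:
    """16-bit ones complement sum of big-endian words, odd tail zero-padded."""
    if len(data) % 2:
        data = data + b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def insert_checksum(packet: bytearray, css: int, cse: int, cso: int) -> int:
    """sum bytes css..cse (inclusive) and store the result at offset cso."""
    total = ones_complement_sum16(bytes(packet[css:cse + 1]))
    checksum = ~total & 0xFFFF  # assumption: inverted as in rfc 1071
    packet[cso:cso + 2] = checksum.to_bytes(2, "big")
    return checksum


# ipv4 header with its checksum field (bytes 10..11) zeroed beforehand
header = bytearray.fromhex("450000730000400040110000c0a80001c0a800c7")
print(hex(insert_checksum(header, css=0, cse=19, cso=10)))  # 0xb861
```

because the sum runs over the checksum field as well, any value software leaves in that field (zero here, or a partial pseudo-header checksum for tcp/udp) is folded into the result.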
7.3.6.2 ip/tcp/udp header updating

the ip/tcp/udp header is updated for each outgoing frame based on the ip/tcp header prototype (partial pseudo-header) which the hardware gets from the first descriptor(s) and stores on chip. the ip/tcp/udp headers are fetched from host memory into an on-chip 240 byte header buffer once for each tcp segmentation context (for performance reasons, this header is not fetched for each additional packet that will be derived from the tcp segmentation process). the checksum fields and other header information are updated on a frame-by-frame basis. the updating process is performed concurrently with the packet data fetch.

7.3.6.2.1 tcp/ip/udp header for the first frame

the hardware makes the following changes to the headers of the first packet that is derived from each tcp segmentation context.

mac header (for snap)
• type/len field = mss + hdrlen - 14

ipv4 header
• ip total length = mss + hdrlen - ipcss
• ip checksum

ipv6 header
• payload length = mss + hdrlen - ipcss - ipv6size (where ipv6size = 40 bytes)

tcp header
• sequence number: the value is the sequence number of the first tcp byte in this frame.
• if fin flag = 1b, it is cleared in the first frame.
• if psh flag = 1b, it is cleared in the first frame.
• tcp checksum

udp header
• udp length: mss + hdrlen - tucss
• udp checksum

7.3.6.2.2 tcp/ip/udp header for the subsequent frames

the hardware makes the following changes to the headers of the subsequent packets that are derived from each tcp segmentation context.

note: number of bytes left for transmission = paylen - (n * mss), where n is the number of frames that have been transmitted.

mac header (for snap packets)
• type/len field = mss + hdrlen - 14
ipv4 header
• ip identification: incremented from last value (wrap around)
• ip total length = mss + hdrlen - ipcss
• ip checksum

ipv6 header
• payload length = mss + hdrlen - ipcss - ipv6size (where ipv6size = 40 bytes)

tcp header
• sequence number update: add previous tcp payload size to the previous sequence number value. this is equivalent to adding the mss to the previous sequence number.
• if fin flag = 1b, it is cleared in these frames.
• if psh flag = 1b, it is cleared in these frames.
• tcp checksum

udp header
• udp length: mss + hdrlen - tucss
• udp checksum

7.3.6.2.3 tcp/ip/udp header for the last frame

the hardware makes the following changes to the headers of the last packet that is derived from each tcp segmentation context.

note: last frame payload bytes = paylen - (n * mss)

mac header (for snap packets)
• type/len field = last frame payload bytes + hdrlen - 14

ipv4 header
• ip total length = (last frame payload bytes + hdrlen) - ipcss
• ip identification: incremented from last value (wrap around)
• ip checksum

ipv6 header
• payload length = last frame payload bytes + hdrlen - ipcss - ipv6size (where ipv6size = 40 bytes)

tcp header
• sequence number update: add previous tcp payload size to the previous sequence number value. this is equivalent to adding the mss to the previous sequence number.
• if fin flag = 1b, set it in this last frame
• if psh flag = 1b, set it in this last frame
• tcp checksum
udp header
• udp length: (last frame payload bytes + hdrlen) - tucss
• udp checksum

7.4 interrupts

the 82574 supports the following interrupt modes:
• pci legacy interrupts
• pci msi - message signaled interrupts
• pci msi-x - extended message signaled interrupts

7.4.1 legacy and msi interrupt modes

in legacy and msi modes, an interrupt cause is reflected by setting one of the bits in the icr register, where each bit reflects one or more causes. the description of the icr register provides the mapping of interrupt causes (for example, a specific rx queue event or a lsc event) to bits in the icr.

mapping of causes relating to the tx and rx queues as well as non-queue causes in this mode is not configurable. each possible queue interrupt cause (such as each rx queue, tx queue or any other interrupt source) has an entry in the icr.

the following configuration and parameters are involved:
• the icr[31:0] bits are allocated to specific interrupt causes

7.4.2 msi-x mode

msi-x defines a separate optional extension to basic msi functionality. compared to msi, msi-x supports a larger maximum number of vectors per function, the ability for software to control aliasing when fewer vectors are allocated than requested, plus the ability for each vector to use an independent address and data value, specified by a table that resides in memory space. however, most of the other characteristics of msi-x are identical to those of msi. for more information on msi-x, refer to the pci local bus specification, revision 3.0.

in msi-x mode, an interrupt cause is mapped into an msi-x vector. this section describes the mapping of interrupt causes (for example, a specific rx queue event or a lsc event) to msi-x vectors.

mapping is accomplished through the ivar register. each possible cause for an interrupt is allocated an entry in the ivar, and each entry in the ivar identifies one msi-x vector.
it is possible to map multiple interrupt causes into the same msi-x vector. interrupt causes that are not related to the tx and rx queues are also mapped via the ivar register.

the icr also reflects interrupt causes related to non-queue causes. these are mapped directly into the icr (as in the legacy case), with each cause allocated a separate bit.
the following configuration and parameters are involved:
• the ivar.int_alloc[4:0] entries map two tx queues, two rx queues and other events to 5 interrupt vectors
• the icr[24:20] bits reflect specific interrupt causes
• five msi-x interrupt vectors are provided (calculated based on four vectors for queues and one vector for other causes). the requested number of vectors is loaded from the msi_x_n fields in the eeprom into the pcie msi-x capability structure of the function.

[figure 36. cause mapping in msi-x mode; labels: interrupt causes (queues and other), ivar, msi-x vectors 0 to 4, icr bits 0, 20, 24, 31]

7.4.3 registers

the interrupt logic consists of the registers listed in the following table, plus the registers associated with msi/msi-x signaling.

register                 acronym  function
interrupt cause          icr      records all interrupt causes - an interrupt is signaled when unmasked bits in this register are set.
interrupt cause set      ics      enables software to set bits in the interrupt cause register.
interrupt mask set/read  ims      sets or reads bits in the interrupt mask.
interrupt mask clear     imc      clears bits in the interrupt mask.
interrupt auto clear     eiac     enables bits in the icr and ims to be cleared automatically following msi-x interrupt without a read or write of the icr.
interrupt auto mask      iam      enables bits in the ims to be set automatically.

interrupt cause registers (icr)

this register records the interrupt causes to provide software with information on the interrupt source.
the interrupt causes include:
• the receive and transmit related interrupts (including new per queue cause).
• other bits in this register are the legacy indication of interrupts such as mdic complete, management and link status change. there is a specific other cause bit that is set if one of these bits is set; this bit can be mapped to a specific msi-x interrupt message.

in msi-x mode, the bits in this register can be configured to auto-clear when the msi-x interrupt message is sent, in order to minimize driver overhead when using msi-x interrupt signaling.

in systems that do not support msi-x, reading the icr register clears its bits, or writing 1b's clears the corresponding bits in this register.

interrupt cause set register (ics)

this register allows software to trigger an immediate interrupt: writing 1b to bits in ics sets the corresponding bits in icr. it is usually used to rearm interrupts that software did not have time to handle in the current interrupt routine.

interrupt mask set and read register (ims) and interrupt mask clear register (imc)

interrupts appear on pcie only if the interrupt cause bit is a one and the corresponding interrupt mask bit is a one. software blocks assertion of an interrupt by clearing the corresponding bit in the mask register. the cause bit stores the interrupt event regardless of the state of the mask bit. separate clear and set registers make this register more thread safe by avoiding a read-modify-write operation on the mask register. the mask bit is set for each bit written to a one in the set register and cleared for each bit written to a one in the clear register. reading the set register (ims) returns the current mask register value.

in msi-x mode, ctrl_ext.pba_support should also be set. for more details see section 10.2.2.5.

interrupt auto clear enable register (eiac)

bits 24:20 in this register enable clearing of the corresponding bit in icr following interrupt generation.
when a bit is set, the corresponding bit in icr and in ims is automatically cleared following an interrupt. used with msi-x interrupt vectors, this feature allows interrupt cause recognition, and selective reset of interrupt cause and mask bits, without requiring software to read the icr register; the penalty related to a pcie read transaction is therefore avoided.

bits in the icr that are not set in eiac need to be cleared with an icr read or an icr write-to-clear.

interrupt auto mask enable register (iam)

in non msi-x mode, each bit in this register enables setting of the corresponding bit in ims following write-to-clear to icr.

in msi-x mode, when ctrl_ext.eiame is set, software can set the bits of this register to select mask bits that are cleared during interrupt processing. in this mode, each bit in this register enables clearing of the corresponding bit in the mask register (im) following interrupt generation.
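the interplay of icr, ics, ims and imc in legacy or msi mode can be illustrated with a small software model. this is a toy python sketch under simplifying assumptions: it ignores eiac, iam, msi-x and interrupt moderation, and the class and method names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


class InterruptModel:
    """toy model of icr/ics/ims/imc in legacy or msi mode (no auto-clear)."""

    def __init__(self) -> None:
        self.icr = 0  # recorded causes
        self.ims = 0  # current mask value

    def write_ics(self, bits: int) -> None:
        self.icr |= bits & MASK32         # software sets cause bits

    def write_ims(self, bits: int) -> None:
        self.ims |= bits & MASK32         # ones set mask bits, zeros are ignored

    def write_imc(self, bits: int) -> None:
        self.ims &= ~bits & MASK32        # ones clear mask bits, zeros are ignored

    def read_ims(self) -> int:
        return self.ims                   # returns the current mask value

    def write_icr(self, bits: int) -> None:
        self.icr &= ~bits & MASK32        # write 1b to clear

    def read_icr(self) -> int:
        value, self.icr = self.icr, 0     # read to clear
        return value

    def interrupt_pending(self) -> bool:
        return bool(self.icr & self.ims)  # cause bit set and mask bit set
```

because `write_ims` and `write_imc` only act on the bits written as ones, two contexts can enable or disable different causes without a read-modify-write of the mask, which is the thread-safety point made above.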
7.4.4 interrupt moderation

the 82574 implements interrupt moderation to reduce the number of interrupts software processes. the moderation scheme is based on a timer called itr (interrupt throttle register). in general terms, the itr defines an interrupt rate by defining the time interval between consecutive interrupts.

the number of itr registers is:
• non msi-x mode - a single itr is used (itr).
• msi-x - a separate eitr is provided per msi-x vector (eitr[0] is allocated to msi-x[0] and its corresponding interrupts, eitr[1] is allocated to msi-x[1] and its corresponding interrupts, etc.)

software uses itr to limit the rate of delivery of interrupts to the host cpu. it provides a guaranteed inter-interrupt delay between interrupts asserted by the network controller, regardless of network traffic conditions.

the following algorithm converts the inter-interrupt interval value to the common 'interrupts/sec' performance metric:

$\text{interrupts/sec} = (256 \times 10^{-9}\ \text{sec} \times \text{interval})^{-1}$

for example, if the interval is programmed to 500d, the 82574 guarantees the cpu is not interrupted by it for at least 128 µs from the last interrupt.

inversely, the inter-interrupt interval value can be calculated as:

$\text{inter-interrupt interval} = (256 \times 10^{-9}\ \text{sec} \times \text{interrupts/sec})^{-1}$

the optimal performance setting for this register is very system and configuration specific.

itr rules:
• the maximum observable interrupt rate from the adapter should not exceed 7813 interrupts/sec.
• the extended interrupt throttle register should default to 0x0 upon initialization and reset.

each time an interrupt event happens, the corresponding bit in the icr is activated. however, an interrupt message is not sent out on the pcie* interface until the eitr counter assigned to the proper msi-x vector that supports the icr bit has counted down to zero. the eitr counter is reloaded after it has reached zero with its initial value and the process repeats again.
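the two conversions above can be checked with a few lines of python. this is a minimal sketch that assumes nothing beyond the 256 ns granularity stated in the formulas; the function and constant names are hypothetical, and rounding to the nearest interval value is a choice of this sketch.

```python
ITR_UNIT_NS = 256  # one interval count corresponds to 256 ns


def interval_to_rate(interval: int) -> float:
    """interrupts/sec for a programmed interval value."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    return 1e9 / (ITR_UNIT_NS * interval)


def rate_to_interval(interrupts_per_sec: float) -> int:
    """interval value, rounded to nearest, for a target interrupt rate."""
    if interrupts_per_sec <= 0:
        raise ValueError("rate must be positive")
    return round(1e9 / (ITR_UNIT_NS * interrupts_per_sec))


def min_gap_us(interval: int) -> float:
    """guaranteed delay between interrupts, in microseconds."""
    return interval * ITR_UNIT_NS / 1000


print(interval_to_rate(500))  # 7812.5
print(min_gap_us(500))        # 128.0
```

the interval of 500 from the example therefore corresponds to 7812.5 interrupts/sec, in line with the 7813 interrupts/sec limit given in the itr rules.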
the interrupt flow should follow the following diagram:
[figure 37. interrupt throttle flow diagram; steps: load counter with interval, start count down, counter = 0?, interrupt active?, assert interrupt, intr ack?, clear interrupt]

for cases where the 82574 is connected to a small number of clients, it is desirable to fire off the interrupt as soon as possible with minimum latency. for these cases, when the eitr counter counts down to zero and no interrupt event has happened, then the eitr counter is not reset but stays at zero. thus, the next interrupt event triggers an interrupt immediately. that scenario is illustrated as case b as follows.
case a: heavy load, interrupts moderated
case b: light load, interrupts immediately on packet receive
[timing diagrams for case a and case b; labels: pkt, itr delay, intr]

7.4.5 clearing interrupt causes

the 82574 has three methods available to clear icr bits: auto-clear, clear-on-write, and clear-on-read.

auto-clear

in systems that support msi-x, the interrupt vector allows the interrupt service routine to know the interrupt cause without reading the icr. the software overhead of an i/o read or write can be avoided by setting appropriate icr bits to auto-clear mode by setting the corresponding bits in the interrupt auto-clear register (eiac).

when auto-clear is enabled for an interrupt cause, the icr bit is set when a cause event occurs. when the eitr counter reaches zero, the msi-x message is sent on pcie. then the icr bit is cleared and enabled to be set by a new cause event. the vector in the msi-x message signals software the cause of the interrupt to be serviced.

it is possible that, between the time the icr bit is cleared and the time the interrupt service routine services the cause (for example, by checking the transmit and receive queues), another cause event occurs that is then serviced by this isr call, yet the icr bit remains set. this results in a spurious interrupt. software can detect this case if there are no entries that require service in the transmit and receive queues, and exit knowing that the interrupt has been automatically cleared. the use of interrupt moderation through the eitr register limits the extra software overhead that can be caused by these spurious interrupts.
write to clear

the icr register clears specific interrupt cause bits in the register after writing 1b to those bits. any bit that was written with a 0b remains unchanged.

read to clear

all bits in the icr register are cleared on a read to icr.

7.5 802.1q vlan support

the 82574 provides several specific mechanisms to support 802.1q vlans:
• optional adding (for transmits) and stripping (for receives) of ieee 802.1q vlan tags.
• optional ability to filter packets belonging to certain 802.1q vlans.

7.5.1 802.1q vlan packet format

the following diagram compares an untagged 802.3 ethernet packet with an 802.1q vlan tagged packet:

802.3 packet  #octets  802.1q vlan packet  #octets
da            6        da                  6
sa            6        sa                  6
type/length   2        802.1q tag          4
data          46-1500  type/length         2
crc           4        data                46-1500
                       crc*                4

note: the crc for the 802.1q tagged frame is re-computed, so that it covers the entire tagged frame including the 802.1q tag header. also, maximum frame size for an 802.1q vlan packet is 1522 octets as opposed to 1518 octets for a normal 802.3z ethernet packet.

7.5.1.1 802.1q tagged frames

for 802.1q, the tag header field consists of four octets comprised of the tag protocol identifier (tpid) and tag control information (tci), each taking two octets. the first 16 bits of the tag header make up the tpid. it contains the protocol type, which identifies the packet as a valid 802.1q tagged packet.

the two octets making up the tci contain three fields:
• user priority (up)
• canonical form indicator (cfi). should be 0b for transmits. for receives, the device has the capability to filter out packets that have this bit set. see the cfien and cfi bits in the rctl described in section 10.2.5.1.
• vlan identifier (vid)
the bit ordering, from octet 1 to octet 2, is: up, cfi, vid.

7.5.2 transmitting and receiving 802.1q packets

since the 802.1q tag is only four bytes, adding and stripping of tags could be done completely in software. (in other words, for transmits, software inserts the tag into packet data before it builds the transmit descriptor list, and for receives, software strips the 4-byte tag from the packet data before delivering the packet to upper layer software.) however, because adding and stripping of tags in software results in more overhead for the host, the 82574 has additional capabilities to add and strip tags in hardware. see section 7.5.2.1 and section 7.5.2.2.

7.5.2.1 adding 802.1q tags on transmits

software might command the 82574 to insert an 802.1q vlan tag on a per packet basis. if ctrl.vme is set to 1b, and the vle bit in the transmit descriptor is set to 1b, then the 82574 inserts a vlan tag into the packet that it transmits over the wire. the tag protocol identifier (tpid) field of the 802.1q tag comes from the vet register, and the tag control information (tci) of the 802.1q tag comes from the special field of the transmit descriptor.

7.5.2.2 stripping 802.1q tags on receives

software might instruct the 82574 to strip 802.1q vlan tags from received packets. if the ctrl.vme bit is set to 1b, and the incoming packet is an 802.1q vlan packet (for example, its ethernet type field matches the vet), then the 82574 strips the 4-byte vlan tag from the packet, and stores the tci in the special field of the receive descriptor.

the 82574 also sets the vp bit in the receive descriptor to indicate that the packet had a vlan tag that was stripped. if the ctrl.vme bit is not set, the 802.1q packets can still be received if they pass the receive filter, but the vlan tag is not stripped and the vp bit is not set.

7.5.3 802.1q vlan packet filtering

vlan filtering is enabled by setting the rctl.vfe bit to 1b.
if enabled, hardware compares the type field of the incoming packet to a 16-bit field in the vlan ether type (vet) register. if the vlan type field in the incoming packet matches the vet register, the packet is then compared against the vlan filter table array for acceptance.

[figure: tci bit ordering; octet 1, octet 2; fields up, cfi, vid]
the virtual lan id field indexes a 4096 bit vector. if the indexed bit in the vector is one, there is a virtual lan match. software might set the entire bit vector to ones if the node does not implement 802.1q filtering. the vlan filter table array is described in detail in section 10.2.5.24.

in summary, the 4096-bit vector is comprised of 128 32-bit registers. matching to this bit vector follows the same algorithm as indicated in section 7.1.1 for multicast address filtering. the vlan identifier (vid) field consists of 12 bits. the upper 7 bits of this field are decoded to determine the 32-bit register in the vlan filter table array to address and the lower 5 bits determine which of the 32 bits in the register to evaluate for matching.

two other bits in the receive control register (see section 10.2.5.1), cfien and cfi, are also used in conjunction with 802.1q vlan filtering operations. cfien enables the comparison of the value of the cfi bit in the 802.1q packet to the receive control register cfi bit as acceptance criteria for the packet.

note: the vfe bit does not affect whether the vlan tag is stripped. it only affects whether the vlan packet passes the receive filter.

table 42 lists reception actions per control bit settings.

table 42. packet reception decision table

note: a packet is defined as a vlan/802.1q packet if its type field matches the vet.

7.6 led's

the 82574 implements three output drivers intended for driving external led circuits per port. each of the three led outputs can be individually configured to select the particular event, state, or activity, which is indicated on that output. in addition, each led can be individually configured for output polarity as well as for blinking versus non-blinking (steady-state) indication.

the configuration for led outputs is specified via the ledctl register.
furthermore, the hardware-default configuration for all the led outputs can be specified via nvm fields, thereby supporting led displays configurable to a particular oem preference.

table 42 (packet reception decision table) entries:

is packet 802.1q?  ctrl.vme  rctl.vfe  action
no                 x         x         normal packet reception.
yes                0b        0b        receive a vlan packet if it passes the standard filters (only). leave the packet as received in the data buffer. vp bit in receive descriptor is cleared.
yes                0b        1b        receive a vlan packet if it passes the standard filters and the vlan filter table. leave the packet as received in the data buffer (for example, the vlan tag would not be stripped). vp bit in receive descriptor is cleared.
yes                1b        0b        receive a vlan packet if it passes the standard filters (only). strip off the vlan information (four bytes) from the incoming packet and store in the descriptor. sets the vp bit in receive descriptor.
yes                1b        1b        receive a vlan packet if it passes the standard filters and the vlan filter table. strip off the vlan information (four bytes) from the incoming packet and store in the descriptor. sets the vp bit in receive descriptor.
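the decision table above, together with the vlan filter table lookup from section 7.5.3 (upper 7 bits of the vid select the register, lower 5 bits select the bit), can be sketched in python as follows. this is a simplified software model: it leaves out the cfien/cfi comparison, takes the vid from bits 11:0 of the tci, and uses hypothetical names (`RxResult`, `vfta_match`, `receive`) throughout.

```python
from typing import NamedTuple, Sequence


class RxResult(NamedTuple):
    accepted: bool      # packet passes the receive filters
    tag_stripped: bool  # four-byte vlan tag removed and stored in the descriptor
    vp: bool            # vp bit in the receive descriptor


def vfta_match(vfta: Sequence[int], vid: int) -> bool:
    """look up a 12-bit vid in the 128 x 32-bit vlan filter table array."""
    register = (vid >> 5) & 0x7F  # upper 7 bits pick one of 128 registers
    bit = vid & 0x1F              # lower 5 bits pick the bit in that register
    return bool((vfta[register] >> bit) & 1)


def receive(is_8021q: bool, passes_standard_filters: bool,
            vme: bool, vfe: bool, vfta: Sequence[int], tci: int = 0) -> RxResult:
    """apply the packet reception decision table to one packet."""
    if not is_8021q:
        return RxResult(passes_standard_filters, False, False)
    accepted = passes_standard_filters
    if vfe:  # vfe only affects filtering, never stripping
        accepted = accepted and vfta_match(vfta, tci & 0x0FFF)
    if not accepted:
        return RxResult(False, False, False)
    return RxResult(True, vme, vme)  # vme controls stripping and the vp bit


# enable vid 100 in an otherwise empty filter table: register 3, bit 4
vfta = [0] * 128
vfta[100 >> 5] |= 1 << (100 & 0x1F)
```

keeping the filter lookup separate from the strip decision mirrors the note in section 7.5.3 that vfe and vme act independently.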
each of the three led's might be configured to use one of a variety of sources for output indication. the mode bits control the led source:
• link_100/1000 is asserted when link is established at either 100 or 1000 mb/s.
• link_10/1000 is asserted when link is established at either 10 or 1000 mb/s.
• link_up is asserted when any speed link is established and maintained.
• activity is asserted when link is established and packets are being transmitted or received.
• link/activity is asserted when link is established and there is no transmit or receive activity.
• link_10 is asserted when a 10 mb/s link is established and maintained.
• link_100 is asserted when a 100 mb/s link is established and maintained.
• link_1000 is asserted when a 1000 mb/s link is established and maintained.
• full_duplex is asserted when the link is configured for full duplex operation.
• collision is asserted when a collision is observed.
• paused is asserted when the device's transmitter is flow controlled.
• led_on is always asserted; led_off is always de-asserted.

the ivrt bits enable the led source to be inverted before being output or observed by the blink-control logic. led outputs are assumed to normally be connected to the negative side (cathode) of an external led.

the blink bits control whether the led should be blinked while the led source is asserted, and the blinking frequency (either 200 ms on and 200 ms off or 83 ms on and 83 ms off) 1. the blink control can be especially useful for ensuring that certain events, such as activity indication, cause led transitions which are sufficiently visible to a human eye. the same blinking rate is shared by all leds.

note: the link/activity source functions slightly differently from the others when blink is enabled. the led is off if there is no link, on if there is link and no activity, and blinking if there is link and activity.
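the link/activity behavior and the blink timings described above can be summarized in a short python sketch. it models only the visible led state, ignores the ivrt inversion, and treats "asserted" as "led on"; the function names and the string states are hypothetical. the 5x factor is the one given in the footnote on smart power down mode.

```python
def link_activity_led(link: bool, activity: bool, blink: bool) -> str:
    """visible state of an led driven by the link/activity source."""
    if not link:
        return "off"
    if not activity:
        return "on"  # source is asserted: link and no activity
    # link with activity: the source is de-asserted, so the led is off
    # unless blink is enabled, in which case it blinks
    return "blinking" if blink else "off"


def blink_phase_ms(fast: bool, smart_power_down: bool = False) -> int:
    """duration of the on phase (and of the off phase) in milliseconds."""
    base = 83 if fast else 200
    return base * 5 if smart_power_down else base
```

for example, `link_activity_led(True, True, blink=True)` gives "blinking", and `blink_phase_ms(False, smart_power_down=True)` gives 1000.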
7.7 time sync (ieee1588 and 802.1as)

7.7.1 overview
measurement and control applications are increasingly using distributed system technologies such as network communication, local computing, and distributed objects. many of these applications are enhanced by having an accurate system-wide sense of time achieved by having local clocks in each sensor, actuator, or other system device. without a standardized protocol for synchronizing these clocks, the benefits are unlikely to be realized in the multi-vendor system component market. existing protocols for clock synchronization are not optimum for these applications. for example, network time protocol (ntp) targets large distributed computing systems with ms synchronization requirements.

footnote 1 (led blink rates): while in smart power down mode, the blinking durations are increased by 5x to 1 second and 415 ms, respectively.
the 1588 standard specifically addresses the needs of measurement and control systems:
- spatially localized
- µs to sub-µs accuracy
- administration free
- accessible for both high-end devices and low-cost, low-end devices

the time sync mechanism can be activated in full-duplex mode and with extended descriptors only. there are no limitations on the wire speed, although the wire speed might affect the accuracy.

7.7.2 flow and hardware/software responsibilities
the operation of a precision time protocol (ptp) enabled network is divided into two stages: initialization and time synchronization.

at the initialization stage every master-enabled node starts by sending sync packets that include the parameters of its clock. upon receipt of a sync packet, a node compares the received clock parameters to its own; if the received parameters are better, the node moves to slave state and stops sending sync packets. when in slave state, the node continuously compares incoming sync packets to its currently chosen master, and if the new clock parameters are better, the master selection is transferred to the new master clock. eventually the best master clock is chosen. every node has a defined time-out interval: if no sync packet is received from its chosen master clock within that interval, the node moves back to master state and starts sending sync packets until a new best master clock (bmc) is chosen.

the time synchronization stage differs for master and slave nodes. a node in master state should periodically send a sync packet, which is time stamped by hardware on the tx path (as close as possible to the phy). after the sync packet, a follow_up packet is sent that includes the value of the timestamp kept from the sync packet. in addition, the master should timestamp delay_req packets on its rx path and return the timestamp value to the slave that sent them using a delay_response packet.
a node in slave state should timestamp every incoming sync packet; if the packet came from its selected master, software uses this value for the time offset calculation. in addition, the slave should periodically send delay_req packets in order to calculate the path delay from its master. every delay_req packet sent by the slave is time stamped and the value is kept. with the value received from the master in the delay_response packet, the slave can now calculate the path delay from the master to the slave. the synchronization protocol flow and the offset calculation are shown in figure 38.
[figure 38. sync flow and offset calculation: the master sends sync (timestamp t1 at the master, timestamp t2 at the slave) followed by follow_up(t1); the slave sends delay_req (timestamp t3 at the slave, timestamp t4 at the master); the master returns delay_response(t4). toffset = [(t2-t1)-(t4-t3)]/2]

the hardware responsibilities are:
1. identify the packets that require time stamping.
2. timestamp the packets on both rx and tx paths.
3. store the time stamp value for software.
4. keep the system time in hardware and give a time adjustment service to the software.

the software is responsible for:
1. bmc protocol execution, which means defining the node state (master or slave) and selecting the master clock if in slave state.
2. generating and consuming ptp packets.
3. calculating the time offset and adjusting the system time using the hardware mechanism for that.
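The offset calculation that software performs can be sketched as follows. This is a minimal illustration, not driver code: it uses the standard ieee 1588 relation offset = [(t2-t1)-(t4-t3)]/2, assumes a symmetric path delay, and the function name is hypothetical. The mean path delay is included because it falls out of the same four timestamps.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Return (offset, delay) of the slave clock relative to the master.

    t1: sync departure time, stamped by the master
    t2: sync arrival time, stamped by the slave
    t3: delay_req departure time, stamped by the slave
    t4: delay_req arrival time, stamped by the master
    All four values must use the same time unit.
    """
    master_to_slave = t2 - t1   # path delay + offset
    slave_to_master = t4 - t3   # path delay - offset
    offset = (master_to_slave - slave_to_master) / 2
    delay = (master_to_slave + slave_to_master) / 2
    return offset, delay


# Example: slave clock runs 100 units ahead, one-way delay is 30 units.
offset, delay = ptp_offset_and_delay(t1=1000, t2=1130, t3=2000, t4=1930)
```

A positive offset means the slave clock is ahead of the master, so the adjustment to apply to the slave system time is the negated offset.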
table 43. chronological order of events for sync and path delay
action | responsibility | node role
generate a sync packet with timestamp notification in descriptor. | sw | master
timestamp the packet and store the value in registers (t1). | hw | master
timestamp incoming sync packet, store the value in register and store the sourceid and sequenceid in registers (t2). | hw | slave
read the timestamp from register, put it in a follow_up packet and send. | sw | master
once the follow_up is received, store t2 from registers and t1 from the follow_up packet. | sw | slave
generate a delay_req packet with timestamp notification in descriptor. | sw | slave
timestamp the packet and store the value in registers (t3). | hw | slave
timestamp incoming delay_req packet, store the value in register and store the sourceid and sequenceid in registers (t4). | hw | master
read the timestamp from register and send back to slave using a delay_response packet. | sw | master
once the delay_response packet is received, calculate offset using t1, t2, t3 and t4 values. | sw | slave

7.7.2.1 timesync indications in rx and tx packet descriptors
some indications need to be transferred between software and hardware regarding ptp packets. on the tx path, software should set the tst bit in the extcmd field of the tx advanced descriptor.

on the rx path, hardware has two indications to transfer to software. the first indicates that the packet is a ptp packet, whether or not a timestamp was taken; it also covers the other types of ptp packets needed for management of the protocol. this indication is set only for l2 ptp packets (a ptp packet is identified according to its ethertype): such packets have the packettype field set to 0xe to indicate that the etype matches the filter number set by software to filter ptp packets. udp ptp packets don't need such an indication, since the port number (319 for event packets and 320 for all other ptp packets) directs the packets toward the time sync application.

the second indication is the tst bit in the extended status field of the rx descriptor; this bit indicates to software that a time stamp was taken for this packet. software needs to access the time stamp registers to get the timestamp values.

7.7.3 hardware time sync elements
all time sync hardware elements are reset to their initial values, as defined in the registers section, upon mac reset.
7.7.3.1 system time structure and mode of operation
the time sync logic contains an up counter to maintain the system time value. this is a 64-bit counter that is built of the systiml and systimh registers. when in master state, the systimh and systiml registers should be set once by the software according to the general system. when in slave state, software should update the system time on every sync event as described in section 7.7.3.3. setting the system time is done by a direct write to the systimh register and a fine-tune setting of the systiml register using the adjustment mechanism described in section 7.7.3.3.

read access to the systimh and systiml registers should be executed in the following manner:
1. software reads register systiml; at this stage the hardware should latch the value of systimh.
2. software reads register systimh; the latched value (from the last read of systiml) should be returned by hardware.

upon an increment event the system time value should increment by the value stored in timinca.incvalue. an increment event happens every timinca.incperiod cycles; if incperiod is one, an increment event should occur on every clock cycle. the incvalue defines the granularity in which the time is represented by the systimh/l registers. for example, if the cycle time is 16 ns and the incperiod is one, then an incvalue of 16 represents the time in nanoseconds, an incvalue of 160 represents the time in 0.1 ns units, and so on. the incperiod helps to avoid inaccuracy in cases where the cycle time t cannot be represented as a simple integer and should be multiplied to get to an integer representation. the incperiod value should be as small as possible to achieve the best possible accuracy. for more details please refer to section 10.2.9.13 and the following ones.
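The increment and read rules above can be sketched as a small software model. This is an illustration only: `SysTimModel`, `count_unit_ns` and `read_system_time` are hypothetical names, not part of any driver or of the device register interface, and the model simply mirrors the latching and incvalue/incperiod behavior described in the text.

```python
from fractions import Fraction

MASK64 = (1 << 64) - 1


def count_unit_ns(cycle_ns, inc_value, inc_period=1):
    """Nanoseconds represented by one count of the 64-bit system time."""
    return Fraction(cycle_ns * inc_period, inc_value)


class SysTimModel:
    """Toy model of the systiml/systimh counter (hypothetical, for illustration)."""

    def __init__(self, inc_value, inc_period=1, start=0):
        self.inc_value = inc_value
        self.inc_period = inc_period
        self.value = start & MASK64
        self._cycles = 0          # clock cycles since the last increment event
        self._latched_high = 0    # systimh value latched by a systiml read

    def tick(self, cycles=1):
        """Advance the clock; add inc_value once per inc_period cycles."""
        self._cycles += cycles
        events, self._cycles = divmod(self._cycles, self.inc_period)
        self.value = (self.value + events * self.inc_value) & MASK64

    def read_systiml(self):
        """Reading the low half latches the high half."""
        self._latched_high = self.value >> 32
        return self.value & 0xFFFFFFFF

    def read_systimh(self):
        """Return the high half latched by the last systiml read."""
        return self._latched_high


def read_system_time(dev):
    """Read the 64-bit time in the required order: systiml, then systimh."""
    low = dev.read_systiml()
    high = dev.read_systimh()
    return (high << 32) | low
```

Reading the low half first matters because the counter can carry into the high half between the two reads; the latch keeps both halves consistent.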
note: system time registers should be implemented on a free running clock to make sure the system time is kept valid on traffic idle times (dynamic clock gating).

7.7.3.2 time stamping mechanism
the time stamping logic is located on the tx and rx paths at a location as close as possible to the phy. this is to reduce delay uncertainties originating from implementation differences. the operation of this logic is slightly different on tx and on rx.

the tx part decides to timestamp a packet if the tx timestamp is enabled and the time stamp bit in the packet descriptor is set. on the tx side only the time is captured.
on the rx side this logic parses the traversing frame. if rx timestamp is enabled and the frame matches the ethertype, udp port (if needed), version and message type as defined in the register described in section 10.2.9.7, the time, sourceid and sequenceid are latched in the timestamp registers. in addition, two indications are added in the rx descriptor: one to identify that this is a ptp packet (done with the packet type; this is only for l2 packets, since on udp packets the port number directs the packet to the application) and the second (ts) to identify that a time stamp was taken for this packet. if a ptp packet is received but does not match the time stamping criteria (it is not an event packet), or for some reason a time stamp was not taken, only the first indication is added. for more details please refer to the time stamp registers sections (section 10.2.9.8 or section 10.2.9.1).

the following figure defines the exact point where the time value should be captured. on both sides the time stamp values are locked in the registers until software access. this means that if a new ptp packet that requires a time stamp arrives before software has accessed the time stamp of the previous ptp packet, the new ptp packet is not time stamped. in some cases on the rx path a packet that was time stamped might be lost and not get to the host. to avoid a lock condition, software should keep a watchdog timer to clear the locking of the time stamp register. the value of such a timer should be at least higher than the expected interval between two sync or delay_req packets (depending on master or slave).

[figure 39. time stamp point]

7.7.3.3 time adjustment mode of operation
a node in a time sync network can be in one of two states: master or slave. when a time sync entity is in master state it should synchronize other entities to its system clock. in this case no time adjustments are needed.
when the entity is in slave state it should adjust its system clock using the data that arrived with the follow_up and delay_response packets and the time stamp values of the sync and delay_req packets. when it has all the values, software on the slave entity can adjust its offset in the following manner.
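Once software has a signed adjustment, the register update described in this section splits it into a low word for timadjl and a sign-plus-magnitude high word for timadjh. The sketch below illustrates that split. It is a model under stated assumptions: the helper names are hypothetical, and a sign bit of 1 is assumed to mean a negative adjustment.

```python
MASK64 = (1 << 64) - 1
SIGN_BIT = 1 << 31


def encode_timadj(adjustment):
    """Split a signed adjustment into (timadjl, timadjh) register values.

    timadjl carries the low 32 bits of the magnitude; timadjh carries the
    upper 31 bits of the magnitude in bits 30:0 and the sign in bit 31.
    Assumption: sign bit 1 means the adjustment is negative.
    """
    magnitude = abs(adjustment)
    if magnitude >= 1 << 63:
        raise ValueError("adjustment does not fit in 63 bits")
    timadjl = magnitude & 0xFFFFFFFF
    timadjh = (magnitude >> 32) & 0x7FFFFFFF
    if adjustment < 0:
        timadjh |= SIGN_BIT
    return timadjl, timadjh


def apply_timadj(systim, timadjl, timadjh):
    """Model of the hardware adding the adjustment after timadjh is written."""
    magnitude = ((timadjh & 0x7FFFFFFF) << 32) | timadjl
    delta = -magnitude if timadjh & SIGN_BIT else magnitude
    return (systim + delta) & MASK64


# Write order in a real driver: timadjl first, then timadjh (which triggers the add).
low, high = encode_timadj(-100)
new_time = apply_timadj(1_000_000, low, high)
```

In this example the slave clock is pulled back by 100 counts.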
after offset calculation the system time register should be updated. this is done by writing the calculated offset to the timadjl and timadjh registers. the order should be as follows:
1. write the lower portion of the offset to timadjl.
2. write the high portion of the offset to the lower 31 bits of timadjh and the sign to the most significant bit.
after the write cycle to timadjh the value of timadjh and timadjl should be added to the system time.

7.7.4 ptp packet structure
the time sync implementation supports both the 1588 v1 and v2 ptp frame formats. the v1 structure can come only as udp payload over ipv4, while the v2 can come over l2 with its ethertype or as a udp payload over ipv4 or ipv6. the 802.1as uses only the layer 2 v2 format.

table 44. v1 and v2 ptp message structure
(where two fields share a byte, they are listed from bit 7 down to bit 0)
offset in bytes | v1 fields | v2 fields
0 | versionptp | transportspecific (see note 1), messageid
1 | versionptp (continued) | reserved, versionptp
2-3 | versionnetwork | messagelength
4 | subdomain | subdomainnumber
5 | subdomain (continued) | reserved
6-7 | subdomain (continued) | flags
8-13 | subdomain (continued) | correctionns
14-15 | subdomain (continued) | correctionsubns
16-19 | subdomain (continued) | reserved
20 | messagetype | reserved
21 | source communication technology | source communication technology
22-27 | sourceuuid | sourceuuid
28-29 | sourceportid | sourceportid
30-31 | sequenceid | sequenceid
32 | control | control
33 | reserved | logmessageperiod
34-35 | flags | n/a
1. should be all zero.

note: only the fields with the bold italic format colored red are of interest to the hardware.

table 45. ptp message over layer 2
ethernet (l2) | vlan (optional) | ptp ethertype | ptp message

table 46. ptp message over layer 4
ethernet (l2) | ip (l3) | udp | ptp message

when a ptp packet is recognized (by ethertype or udp port address) on the rx side, the version should be checked. if it is v1, then the control field at offset 32 should be compared to the control field in the register described in section 10.2.9.7. otherwise the byte at offset 0 (messageid) should be used for comparison to the messageid field. the rest of the needed fields are at the same location and size for both v1 and v2 versions.

table 47. message decoding for v1 (control field at offset 32)
enumeration | value
ptp_sync_message | 0
ptp_delay_req_message | 1
ptp_followup_message | 2
ptp_delay_resp_message | 3
ptp_management_message | 4
reserved | 5-255
table 48. message decoding for v2 (messageid field at offset 0)
messageid | message type | value (hex)
ptp_sync_message | event | 0
ptp_delay_req_message | event | 1
ptp_path_delay_req_message | event | 2
ptp_path_delay_resp_message | event | 3
unused | | 4-7
ptp_followup_message | general | 8
ptp_delay_resp_message | general | 9
ptp_path_delay_followup_message | general | a
ptp_announce_message | general | b
ptp_signalling_message | general | c
ptp_management_message | general | d
unused | | e-f

if v2 mode is configured in section 10.2.9.8 then a timestamp should be taken on ptp_path_delay_req_message and ptp_path_delay_resp_message for any value in the message field in the register described in section 10.2.9.7.
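The decoding tables and the comparison rule can be sketched in Python as below. This is an illustrative model rather than the hardware logic: the function and parameter names are hypothetical, the configured control and messageid values stand in for the register fields of section 10.2.9.7, and the `v2_mode` flag is one reading of the rule about path delay messages.

```python
V1_CONTROL = {
    0: "ptp_sync_message",
    1: "ptp_delay_req_message",
    2: "ptp_followup_message",
    3: "ptp_delay_resp_message",
    4: "ptp_management_message",
}

V2_MESSAGE_ID = {
    0x0: ("ptp_sync_message", "event"),
    0x1: ("ptp_delay_req_message", "event"),
    0x2: ("ptp_path_delay_req_message", "event"),
    0x3: ("ptp_path_delay_resp_message", "event"),
    0x8: ("ptp_followup_message", "general"),
    0x9: ("ptp_delay_resp_message", "general"),
    0xA: ("ptp_path_delay_followup_message", "general"),
    0xB: ("ptp_announce_message", "general"),
    0xC: ("ptp_signalling_message", "general"),
    0xD: ("ptp_management_message", "general"),
}


def message_name(version, value):
    """Decode a v1 control value or a v2 messageid value to its name."""
    if version == 1:
        return V1_CONTROL.get(value, "reserved")
    return V2_MESSAGE_ID.get(value, ("unused", None))[0]


def should_timestamp(version, value, cfg_control_v1, cfg_message_id_v2, v2_mode=False):
    """Decide whether a recognized ptp packet matches the time stamp criteria.

    version: 1 or 2, taken from the packet
    value: control field (v1, offset 32) or messageid (v2, offset 0)
    cfg_*: values software configured for comparison (assumed parameters)
    """
    if version == 1:
        return value == cfg_control_v1
    if version == 2:
        # In v2 mode the path delay request/response messages are always stamped.
        if v2_mode and value in (0x2, 0x3):
            return True
        return value == cfg_message_id_v2
    return False
```

For example, a slave configured with `cfg_control_v1=0` and `cfg_message_id_v2=0` stamps incoming sync packets of either version.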
8.0 system manageability
network management is an increasingly important requirement in today's networked computer environment. software-based management applications provide the ability to administer systems while the operating system is functioning in a normal power state (not in a pre-boot state or powered-down state). the intel® system management bus (smbus) interface and the network controller - sideband interface (nc-si) for the 82574 fill the management void that exists when the operating system is not running or not fully functional. this is accomplished by providing a mechanism by which manageability network traffic can be routed to and from a management controller (mc).

the 82574 provides two different and mutually exclusive bus interfaces for manageability traffic. the first is the intel® proprietary smbus interface; several generations of intel® ethernet controllers have provided this same interface, which operates at speeds of up to 400 khz. the second interface is nc-si, which is a new industry standard interface created by the dmtf specifically for routing manageability traffic to and from an mc. the nc-si interface operates at 100 mb/s full-duplex speeds.

8.1 scope
this section describes the supported management interfaces and hardware configurations for platform system management. it describes the interfaces to an external mc, the partitioning of platform manageability among system components, and the functionality provided by the 82574 in each of the platform configurations.

8.2 pass-through (pt) functionality
pass-through (pt) is the term used when referring to the process of sending and receiving ethernet traffic over the sideband interface. the 82574 has the ability to route ethernet traffic to the host operating system as well as the ability to send ethernet traffic over the sideband interface to an external mc.
[figure 40. sideband interface: the mc connects to the 82574 over the sideband interface, the host connects over the host interface, and the 82574 connects to the lan through the port 0 lan interface.]

the sideband interface provides a mechanism by which the 82574 can be shared between the host and the mc. by providing this sideband interface, the mc can communicate with the lan without requiring a dedicated ethernet controller to do so. the 82574 supports two sideband interfaces:
- smbus
- nc-si

the usable bandwidth for either direction is up to 400 kb/s when using the smbus interface and 100 mb/s for the nc-si interface. note that only one mode of sideband can be active at any given time. this configuration is done via an nvm setting (see section 6.0 for more details).

8.3 components of a sideband interface
there are two components to a sideband interface:
- physical layer - the electrical layer that transfers data
- logical layer - the agreed upon protocol that is used for communications

the mc and the 82574 must be in alignment for both of these components. for example, the nc-si physical interface is based on the rmii interface. however, there are some differences at the physical level (detailed in the nc-si specification) and the protocol layer is completely different.

8.4 smbus pass-through interface
smbus is the system management bus defined by intel® corporation in 1995. it is used in personal computers and servers for low-speed system management communications. the smbus interface is one of two pass-through interfaces available in the 82574.
this section describes how the smbus interface in the 82574 operates in pass-through mode.

8.4.1 general
the smbus sideband interface includes the standard smbus commands used for assigning a slave address and gathering device information, as well as intel® proprietary commands used specifically for the pass-through interface.

8.4.2 pass-through capabilities
this section details the specific manageability capabilities the 82574 provides while in smbus mode. the pass-through traffic is carried by the sideband interface as described in section 8.2.

note: these services are not available in nc-si mode.

8.4.2.1 packet filtering
since the host operating system and the mc both use the 82574 to send and receive ethernet traffic, there needs to be a mechanism by which incoming ethernet packets can be identified as those that should be sent to the mc rather than the host operating system.

in order to determine the types of traffic that are forwarded to the mc over the sideband interface, the 82574 supports a manageability receive filtering mechanism. this mechanism is used to determine if a received packet should be forwarded to the mc or to the host. following is a list of the filtering capabilities available for the smbus interface with the 82574:
- rmcp/rmcp+ ports
- flexible udp/tcp port filters
- 128-byte flexible filters
- vlan
- ipv4 address
- ipv6 address
- mac address filters

each of these is discussed in detail later in this section.

8.4.3 manageability receive filtering
this section describes the manageability receive packet filtering flow when using the smbus pass-through interface. the description applies to the capability of the 82574's lan port.

a packet that is received by the 82574 can be discarded, sent to host memory, sent to the external mc, or sent to both the external mc and host memory. there are two modes of receive manageability filtering:
1. receive all - all received packets are routed to the mc in this mode. it is enabled by setting the rcv_tco_en bit (which enables packets to be routed to the mc) and the rcv_all bit (which routes all packets to the mc) in the management control (manc) register.
2. receive filtering - in this mode only certain types of packets are directed to the manageability block. the mc should set the rcv_tco_en bit together with the specific packet type bits in the manageability filtering registers.

note: the rcv_all bit must be cleared if filtering is enabled.

in default mode, a packet that is directed to the mc is not directed to host memory. the mc can also configure the 82574 to direct certain manageability packets to host memory by setting the en_mng2host bit in the manc register. it then needs to configure the 82574 to send manageability packets to the host (according to their type) by setting the corresponding bits in the manc2h register.

arp requests are an example of packets that might need to be sent to both the mc and the host operating system. if the mc configures the manageability filters to send arp requests to the mc but does not also configure the settings to send them to the host, then the host operating system never receives arp requests.

the mc controls the types of packets that it receives by programming the receive manageability filters. following is the list of filters that are accessible to the mc:

table 49. available filters
filters | functionality | when reset?
filters enable | general configuration of the manageability filters | internal power on reset and firmware reset
manageability to host | enables routing of manageability packets to host | internal power on reset and firmware reset
manageability decision filters [6:0] | configuration of manageability decision filters | internal power on reset and firmware reset
mac address [3:0] | four unicast mac manageability addresses | internal power on reset
vlan filters [7:0] | eight vlan tag values | internal power on reset
udp/tcp port filters [15:0] | 16 destination port values | internal power on reset
flexible 128 bytes tco filters [3:0] | length values for four flex tco filters | internal power on reset
ipv4 and ipv6 address filters [3:0] | ip address for manageability filtering | internal power on reset

all filters are reset only on internal power on reset. register filters that enable filters or functionality are also reset by firmware. these registers can be loaded from the nvm following a reset.
the high-level structure of manageability filtering is done using two steps:
1. packets are filtered by l2 criteria (mac address and unicast/multicast/broadcast).
2. packets are filtered by the manageability filters (port, ip, flex, etc.).

some general rules apply:
- fragmented packets are passed to manageability but not parsed beyond the ip header.
- packets with l2 errors (crc, alignment, etc.) are never forwarded to manageability, unless the rctl.sbp bit is set and there is a packet size error (greater than 1522 or shorter than 64 bytes).

note: the mfval register can enable manageability mac, vlan and ip filtering. these filters also have enable bits in other registers (mac address with rah[15].av, vlan filtering with mavtv[3:0].en, ipv4 filtering with ipav.ip40 and ipv6 filtering with ipav.ip60). any of these filters is enabled if one of its enable bits is set to 1b.

note: if the manageability unit uses a dedicated mac address/vlan tag, it should take care not to use l3/l4 decision filtering on top of it. otherwise all the packets with the manageability mac address/vlan tag that are filtered out at l3/l4 are forwarded to the host.

the following sections describe each of these stages in detail.

8.4.3.1 l2 layer filtering
figure 41 shows the manageability l2 filtering. a packet passes successfully through l2 filtering if any of the following conditions are met:
1. it is a unicast packet and promiscuous unicast filtering is enabled.
2. it is a unicast packet and it matches one of the unicast mac filters (host or manageability).
3. it is a multicast packet and promiscuous multicast filtering is enabled.
4. it is a multicast packet and it matches one of the multicast filters.
5. it is a broadcast packet.

note: a broadcast packet does not go through vlan filtering (that is, vlan filtering is assumed to match).
promiscuous unicast mode - promiscuous unicast mode can be set/cleared only by the software device driver (not by the mc), and it is usually used when the lan device is used as a sniffer.

promiscuous multicast mode - promiscuous multicast is used in lan devices that are used as a sniffer, and is controlled only by the software device driver. this bit can also be used by an mc requiring forwarding of all multicasts.

unicast filtering - the entire mac address is checked against the 16 unicast addresses. the 15 host unicast addresses are controlled by the software device driver (the mc must not change them). the last unicast address (address 16) is dedicated to management functions and is only accessed by the mc. the mc configures manageability unicast filtering via the rah[15] and ral[15] registers and enables it in the mfval register.
multicast filtering - only 12 bits out of the packet's destination mac address are compared against the multicast entries. these entries can be configured only by the software device driver and cannot be controlled by the mc.

[figure 41. l2 packet filtering (receive): flowchart testing unicast packet and promiscuous unicast enable, unicast filter pass, broadcast packet, multicast packet, promiscuous multicast enable and multicast filter pass, ending in drop packet or manageability filtering.]
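The five pass conditions of the l2 stage reduce to a short predicate. The sketch below is a minimal model assuming the match results (exact unicast match, multicast filter match) have already been computed; the function and parameter names are hypothetical.

```python
def passes_l2_filter(kind, promisc_unicast=False, promisc_multicast=False,
                     unicast_match=False, multicast_match=False):
    """Return True if a received packet passes the l2 filtering stage.

    kind: "unicast", "multicast" or "broadcast"
    unicast_match: destination matched a host or manageability unicast filter
    multicast_match: destination matched one of the multicast filters
    """
    if kind == "unicast":
        return promisc_unicast or unicast_match
    if kind == "multicast":
        return promisc_multicast or multicast_match
    if kind == "broadcast":
        return True  # broadcast always passes the l2 stage
    raise ValueError(f"unknown packet kind: {kind!r}")
```

A packet that fails this predicate is dropped before the manageability filters are consulted.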
8.4.3.2 manageability filtering
the manageability filtering stage combines some of the checks done at the previous stages with additional l3/l4 checks into a final decision whether to route a packet to the mc. the following sections describe the manageability filtering done at layers l3 and l4, followed by the final filtering rules.

[figure 42. manageability filtering (receive): flowchart testing rcv_en, rcv_all, the vlan filter, mdef0 through mdef7, broadcast with bam=1 and mng2host, ending in packet to mng, packet to host, or drop packet. this stage is part of the general receive filtering.]

8.4.3.3 l3 and l4 filters
arp filtering - the 82574 supports filtering of both arp request packets (initiated externally) and arp responses (to requests initiated by the mc or host).

neighbor discovery filtering - the 82574 supports filtering of neighbor solicitation packets (type 135). neighbor solicitation uses the ipv6 destination address filters defined in the ip6at registers (all enabled ipv6 addresses are matched for neighbor solicitation).
port 0x298/0x26f filtering - the 82574 supports filtering by fixed destination port numbers, port 0x26f and port 0x298.

flex port filtering - the 82574 implements four flex destination port filters. the 82574 directs packets whose l4 destination port matches the value of the respective word in the mfutp registers. the mc must ensure that only valid entries are enabled in the decision filters.

flex tco filters - the 82574 provides two flex tco filters. each filter looks for a pattern match within the first 128 bytes of the packet. the mc configures the pattern to match into the ftft table. the mc must ensure that only valid entries are enabled in the decision filters.

note: the flex filters are temporarily disabled when read from or written to by the host. any packet received during a read or write operation is dropped. filter operation resumes once the read or write access completes.

ip address filtering - the 82574 supports filtering by ip address using ipv4 and ipv6 address filters dedicated to manageability.

checksum filter - if bit manc.en_xsum_filter is set, the 82574 directs packets to the mc only if they pass the l3/l4 checksum (if one exists), in addition to matching the other filters previously described.

8.4.3.4 manageability decision filters
the manageability decision filters are a set of eight filters (mdef0-mdef7), each with the same structure. the filtering rule for each decision filter is programmed by the mc and defines which of the l2, vlan, and manageability filters participate in the decision. any packet that passes at least one rule is directed to manageability and possibly to the host.

possible filtering criteria are:
- packet passed a valid management l2 unicast address filter.
- packet is a broadcast packet.
- packet has a vlan header and it passed a valid manageability vlan filter.
- packet matched one of the valid ipv4 or ipv6 manageability address filters.
- packet is a multicast packet.
- packet passed arp filtering (request or response).
- packet passed neighbor solicitation filtering.
- packet passed 0x298/0x26f port filter.
- packet passed a valid flex port filter.
- packet passed a valid flex tco filter.

the structure of each of the decision filters is shown in figure 43. a boxed number indicates that the input is conditioned on a mask bit defined in the mdef register for this rule. the decision filter rules are as follows:
- at least one bit must be set in a register. if all bits are cleared (mdef = 0x0000), then the decision filter is disabled and ignored.
- all enabled and filters must match for the decision filter to match. an and filter not enabled in the register is ignored.
- if no or filter is enabled in the register, the or filters are ignored in the decision (the filter might still match).
- if one or more or filters are enabled in the register, then at least one of the enabled or filters must match for the decision filter to match.

[figure 43. manageability decision filter: each input is conditioned on its mask bit in the mdef register. inputs and mask bits shown: manageability l2 unicast address (0), broadcast (1), vlan (2), ip address (3), manageability l2 unicast address (4), broadcast (5), multicast (6), arp request (7), arp response (8), neighbor discovery (9), port 0x298 (10), port 0x26f (11), flex port 0 to flex port 3 (12 to 15), flex tco 0 (28), flex tco 1 (29).]

a decision filter defines the filtering rules. the mc programs a 32-bit register per rule (mdef[7:0]) with the settings listed in section 10.2.8.11. a set bit enables its corresponding filter to participate in the filtering decision.
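The and/or rules for a decision filter can be sketched as a bitmask check. This is an illustrative model only: the constant and function names are hypothetical, and the bit positions and and/or roles follow the decision filter bit assignment given in this section (bits 0 to 3 and 6 as and inputs; bits 4, 5, 7 to 15, 28 and 29 as or inputs). The `matched` argument uses the same bit positions as the mdef register, so a condition with both an and and an or position (unicast, broadcast) is set in both.

```python
UNICAST_AND = 1 << 0
BROADCAST_AND = 1 << 1
VLAN_AND = 1 << 2
IP_ADDRESS_AND = 1 << 3
UNICAST_OR = 1 << 4
BROADCAST_OR = 1 << 5
MULTICAST_AND = 1 << 6
ARP_REQUEST_OR = 1 << 7
ARP_RESPONSE_OR = 1 << 8
NEIGHBOR_SOLICITATION_OR = 1 << 9
PORT_0X298_OR = 1 << 10
PORT_0X26F_OR = 1 << 11
FLEX_PORT_OR = [1 << bit for bit in range(12, 16)]   # flex port 0..3
FLEX_TCO_OR = [1 << 28, 1 << 29]                     # flex tco 0..1

AND_MASK = (UNICAST_AND | BROADCAST_AND | VLAN_AND |
            IP_ADDRESS_AND | MULTICAST_AND)
OR_MASK = (UNICAST_OR | BROADCAST_OR | ARP_REQUEST_OR | ARP_RESPONSE_OR |
           NEIGHBOR_SOLICITATION_OR | PORT_0X298_OR | PORT_0X26F_OR |
           sum(FLEX_PORT_OR) | sum(FLEX_TCO_OR))


def decision_filter_matches(mdef, matched):
    """Evaluate one decision filter against the conditions a packet met."""
    mdef &= AND_MASK | OR_MASK          # ignore reserved bits
    if mdef == 0:
        return False                    # all bits clear: filter disabled
    and_enabled = mdef & AND_MASK
    if matched & and_enabled != and_enabled:
        return False                    # every enabled and filter must match
    or_enabled = mdef & OR_MASK
    if or_enabled and not (matched & or_enabled):
        return False                    # at least one enabled or filter must match
    return True


def routed_to_mc(mdefs, matched):
    """A packet goes to manageability if any of the decision filters matches."""
    return any(decision_filter_matches(mdef, matched) for mdef in mdefs)
```

For example, a rule of `BROADCAST_AND | ARP_REQUEST_OR | ARP_RESPONSE_OR` matches broadcast arp requests and responses, and rejects a broadcast packet that is not arp.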
table 50. assignment of decision filters bits
filter | and/or input | mask bits in mdef[7:0]
l2 unicast address | and | 0
broadcast | and | 1
manageability vlan | and | 2
ip address | and | 3
l2 unicast address | or | 4
broadcast | or | 5
multicast | and | 6
arp request (see note 1) | or | 7
arp response (see note 1) | or | 8
neighbor solicitation | or | 9
port 0x298 | or | 10
port 0x26f | or | 11
flex port 3:0 | or | 15:12
reserved | -- | 27:16
flex tco 1:0 | or | 29:28
reserved | -- | 31:30
1. ip address checking on arp packets is controlled by manc.dis_ip_addr_for_arp.

in default mode, packets that are directed to the mc are not directed to host memory. the mc can also configure the 82574 to direct certain manageability packets to host memory by setting the en_mng2host bit in the manc register and then configuring the 82574 to send manageability packets to the host, according to their type, by setting the corresponding bits in the manc2h register (one bit per each of the eight decision rules). all manageability filters are controlled by the mc only and not by the lan device driver.

the mng2host register has the following structure:
table 51. manage 2 host
bits | description | default
0 | decision filter 0: determines if packets that have passed decision filter 0 are also forwarded to the host operating system. |
1 | decision filter 1: determines if packets that have passed decision filter 1 are also forwarded to the host operating system. |
2 | decision filter 2: determines if packets that have passed decision filter 2 are also forwarded to the host operating system. |
3 | decision filter 3: determines if packets that have passed decision filter 3 are also forwarded to the host operating system. |
4 | decision filter 4: determines if packets that have passed decision filter 4 are also forwarded to the host operating system. |
5 | unicast and mixed: determines if broadcast packets are also forwarded to the host operating system. |
6 | global multicast: determines if unicast packets are also forwarded to the host operating system. |
7 | broadcast: determines if multicast packets are also forwarded to the host operating system. |

the mc enables these filters by issuing the update management receive filter parameters command (see section 8.8.1.6) with the parameter of 0x60.

8.4.4 smbus transactions
this section gives a brief overview of the smbus protocol. following is an example of the format of a typical smbus transaction:

bits | 1 | 7 | 1 | 1 | 8 | 1 | 8 | 1 | 1
field | s | slave address | wr | a | command | a | pec | a | p
value | | 1100 001 | 0 | 0 | 0000 0010 | 0 | [data dependent] | 0 |

the top row of the table identifies the bit length of the field in a decimal bit count. the middle row (bordered) identifies the name of the fields used in the transaction. the last row appears only with some transactions, and lists the value expected for the corresponding field. this value can be either hexadecimal or binary.

the shaded fields are fields that are driven by the slave of the transaction. the unshaded fields are fields that are driven by the master of the transaction. the smbus controller is a master for some transactions and a slave for others. the differences are identified in this document.

shorthand field names are listed in table 52 and are fully defined in the smbus specification.
Table 52. Shorthand Field Names

Field name   Definition
S            SMBus START symbol
P            SMBus STOP symbol
PEC          Packet error code
A            ACK (acknowledge)
N            NACK (not acknowledge)
Rd           Read operation (read value = 1b)
Wr           Write operation (write value = 0b)

8.4.4.1 SMBus Addressing

The SMBus addresses (enabled from the NVM) can be re-assigned using the SMBus ARP protocol. In addition to the SMBus address values, all parameters of the SMBus (SMBus channel selection, address mode, and address enable) can be set only through NVM configuration. Note that the NVM is read at the 82574's power up and resets.

All SMBus addresses should be in network byte order (NBO); MSB first.

8.4.4.2 SMBus ARP Functionality

The 82574 supports the SMBus ARP protocol as defined in the SMBus 2.0 specification. The 82574 is a persistent slave address device, so its SMBus address is valid after power-up and loaded from the NVM. The 82574 supports all SMBus ARP commands defined in the SMBus specification, both general and directed.

Note: The SMBus ARP capability can be disabled through the NVM.

8.4.4.3 SMBus ARP Flow

SMBus ARP flow is based on the status of two flags:

? AV (address valid): This flag is set when the 82574 has a valid SMBus address.
? AR (address resolved): This flag is set when the 82574 SMBus address is resolved (SMBus address was assigned by the SMBus ARP process).

Note: These flags are internal 82574 flags and are not exposed to external SMBus devices.

Since the 82574 is a persistent SMBus address (PSA) device, the AV flag is always set, while the AR flag is cleared after power up until the SMBus ARP process completes. Since AV is always set, the 82574 always has a valid SMBus address.

When the SMBus master needs to start an SMBus ARP process, it resets (in terms of ARP functionality) all devices on the SMBus by issuing either Prepare to ARP or Reset Device commands. When the 82574 accepts one of these commands, it clears its AR flag (if set from a previous SMBus ARP process), but not its AV flag (the current SMBus address remains valid until the end of the SMBus ARP process).
Clearing the AR flag means that the 82574 responds to the following SMBus ARP transactions that are issued by the master. The SMBus master issues a Get UDID command (general or directed) to identify the devices on the SMBus. The 82574 always responds to the directed command, and to the general command only if its AR flag is not set.

After the Get UDID, the master assigns the 82574 SMBus address by issuing an Assign Address command. The 82574 checks whether the UDID matches its own UDID and, if it matches, switches its SMBus address to the address assigned by the command (byte 17). After accepting the Assign Address command, the AR flag is set and from this point (as long as the AR flag is set), the 82574 does not respond to the Get UDID general command. Note that all other commands are processed even if the AR flag is set.

The 82574 stores the SMBus address that was assigned in the SMBus ARP process in the NVM, so at the next power up, it returns to its assigned SMBus address. Figure 44 shows the 82574 SMBus ARP flow.

Figure 44. SMBus ARP Flow (flowchart: power-up reset, flag handling, and per-command ACK/NACK decisions for Prepare to ARP, Reset Device, Assign Address, and Get UDID general and directed)
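The flag handling described in section 8.4.4.3 can be sketched as a small state model. This is an illustrative sketch only, not the device's firmware: the class name `ArpState` and its method names are hypothetical, and NVM storage of the assigned address is not modeled.

```python
from dataclasses import dataclass


@dataclass
class ArpState:
    """Hypothetical model of the AV/AR flag behaviour from section 8.4.4.3."""
    udid: bytes          # 16-byte UDID of this device
    address: int         # current 7-bit SMBus address (loaded from NVM at power-up)
    av: bool = True      # address valid: always set for a persistent-address device
    ar: bool = False     # address resolved: cleared after power-up

    def prepare_to_arp(self) -> None:
        # Prepare to ARP clears AR but leaves AV and the current address alone.
        self.ar = False

    def reset_device(self) -> None:
        # Reset Device has the same effect on the flags as Prepare to ARP.
        self.ar = False

    def get_udid_general(self):
        # The general form is answered only while AR is not set.
        return None if self.ar else self.udid

    def get_udid_directed(self) -> bytes:
        # The directed form is always answered.
        return self.udid

    def assign_address(self, udid: bytes, new_address: int) -> bool:
        # The address is taken only when the UDID in the command matches.
        if udid != self.udid:
            return False
        self.address = new_address
        self.ar = True
        return True
```

A master-side walk-through would call `prepare_to_arp()`, read the UDID with `get_udid_general()`, and then call `assign_address()` with that UDID; after that the general Get UDID returns `None` until the next Prepare to ARP or Reset Device.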
8.4.4.4 SMBus ARP UDID Content

The UDID provides a mechanism to isolate each device for the purpose of address assignment. Each device has a unique identifier. The 128-bit number comprises the following fields, listed from MSB to LSB:

Field                  Size      Value
Device capabilities    1 byte    See notes that follow
Version/revision       1 byte    See notes that follow
Vendor ID              2 bytes   0x8086
Device ID              2 bytes   0x10AA
Interface              2 bytes   0x0004
Subsystem vendor ID    2 bytes   0x0000
Subsystem device ID    2 bytes   0x0000
Vendor specific ID     4 bytes   See notes that follow

Where:

Vendor ID: The device manufacturer's ID as assigned by the SBS Implementers' Forum or the PCI SIG. Constant value: 0x8086.
Device ID: The device ID as assigned by the device manufacturer (identified by the Vendor ID field). Constant value: 0x10AA.
Interface: Identifies the protocol layer interfaces supported over the SMBus connection by the device. In this case, SMBus version 2.0. Constant value: 0x0004.
Subsystem fields: These fields are not supported and return zeros.

Device capabilities: dynamic and persistent address, PEC support bit:

Bit     7:6            5             4             3             2             1             0
Field   Address type   Reserved (0)  Reserved (0)  Reserved (0)  Reserved (0)  Reserved (0)  PEC supported
Value   01b            0b            0b            0b            0b            0b            0b

Version/revision: UDID version 1, silicon revision:

Bit     7             6             5:3            2:0
Field   Reserved (0)  Reserved (0)  UDID version   Silicon revision ID
Value   0b            0b            001b           See the following table
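As a worked example of the layout above, the 16 UDID bytes can be assembled MSB first. This is a sketch under stated assumptions: the function name `build_udid` is hypothetical, the 3-bit silicon revision code is passed in by the caller, and the 4-byte vendor specific ID (derived from the MAC address) is supplied already in MSB-first order rather than extracted here.

```python
import struct


def build_udid(silicon_rev: int, vendor_specific: bytes) -> bytes:
    """Assemble the 128-bit UDID, MSB first, from the fields in section 8.4.4.4."""
    if not 0 <= silicon_rev <= 0b111:
        raise ValueError("silicon revision ID is a 3-bit field")
    if len(vendor_specific) != 4:
        raise ValueError("vendor specific ID is 4 bytes")

    # Bits 7:6 = address type 01b, bits 5:1 reserved, bit 0 (PEC supported) = 0b.
    device_capabilities = 0b01 << 6
    # Bits 7:6 reserved, bits 5:3 = UDID version 001b, bits 2:0 = silicon revision.
    version_revision = (0b001 << 3) | silicon_rev

    fixed = struct.pack(
        ">BBHHHHH",
        device_capabilities,
        version_revision,
        0x8086,  # vendor ID
        0x10AA,  # device ID
        0x0004,  # interface
        0x0000,  # subsystem vendor ID (not supported, returns zeros)
        0x0000,  # subsystem device ID (not supported, returns zeros)
    )
    return fixed + vendor_specific
```

Byte 0 of the returned value corresponds to UDID byte 15 (MSB) in the Assign Address and Get UDID transactions, and the last byte corresponds to UDID byte 0 (LSB).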
Silicon revision ID:

Silicon version   Revision ID
A0                000b
A1                001b

Vendor specific ID: Four LSB bytes of the device Ethernet MAC address. The device Ethernet address is taken from the NVM.

Size     Content (MSB to LSB)
1 byte   MAC address, byte 3
1 byte   MAC address, byte 2
1 byte   MAC address, byte 1
1 byte   MAC address, byte 0

8.4.4.5 Concurrent SMBus Transactions

Concurrent SMBus transactions (receive, transmit, and configuration read/write) are allowed without limitation. Transmit fragments can be sent between receive fragments, and configuration read/write commands can also be issued between receive and transmit fragments.

8.4.5 SMBus Notification Methods

The 82574 supports three methods of notifying the MC that it has information that needs to be read by the MC:

? SMBus alert
? Asynchronous notify
? Direct receive

The notification method that is used by the 82574 can be configured from the SMBus using the Receive Enable command. The default method is set by the NVM in the pass-through init field.

The following events cause the 82574 to send a notification event to the MC:

? Receiving a LAN packet that is designated to the MC.
? Receiving a Request Status command from the MC, which initiates a status response.
? A status change has occurred and the 82574 is configured to notify the external MC at one of the status changes.
? A change in any of the Status Data 1 bits of the Read Status command.

There can be cases where the MC is hung and therefore not responding to the SMBus notification. The 82574 has a time-out value (defined in the NVM) to avoid hanging while waiting for the notification response. If the MC does not respond before the time-out expires, the notification is de-asserted and all pending data is silently discarded.

Note that the SMBus notification time-out value can only be set in the NVM; the MC cannot modify this value.
8.4.5.1 SMBus Alert and Alert Response Method

The SMBus Alert# (SMBALERT_N) signal is an additional SMBus signal that acts as an asynchronous interrupt signal to an external SMBus master. The 82574 asserts this signal whenever it has a message that it needs the MC to read, provided the chosen notification method is the SMBus alert method. Note that the SMBus alert signal is open-drain, which means that other devices besides the 82574 can be connected to the same alert pin. As a result, the MC needs a mechanism to distinguish between the alert sources.

The MC can respond to the alert by issuing an ARA cycle command to detect the alert source device. The 82574 responds to the ARA cycle with its own SMBus slave address (if it was the SMBus alert source) and de-asserts the alert when the ARA cycle completes. Following the ARA cycle, the MC issues a read command to retrieve the 82574 message.

Some MCs do not implement the ARA cycle transaction. These MCs respond to an alert by issuing a read command to the 82574 (0xC0/0xD0 or 0xDE). The 82574 always responds to a read command, even if it is not the source of the notification. The default response is a status transaction. If the 82574 is the source of the SMBus alert, it replies to the read transaction and then de-asserts the alert after the command byte of the read transaction.

The ARA cycle is an SMBus receive byte transaction to SMBus address 0001-100b. Note that the ARA transaction does not support PEC. The ARA transaction format is as follows:

Figure 45. SMBus ARA Cycle Format

Bits:    1, 7, 1, 1, 8, 1, 1, 1
Fields:  S, Alert Response Address, Rd, A, Slave Device Address, A, P
Values:  0001 100, 1, 0, manageability slave SMBus address, 0, 1

8.4.5.2 Asynchronous Notify Method

When configured to use the asynchronous notify method, the 82574 acts as an SMBus master and notifies the MC by issuing a modified form of the write word transaction. The asynchronous notify transaction SMBus address and data payload are configured using the Receive Enable command or using the NVM defaults. Note that the asynchronous notify is not protected by a PEC byte.

Figure 46. Asynchronous Notify Command Format

Bits:    1, 7, 1, 1, 7, 1, 1, ...
Fields:  S, Target Address, Wr, A, Sending Device Address, A, ...
Values:  MC slave address, 0, 0, MNG slave SMBus address, 0, 0

Bits:    8, 1, 8, 1, 1
Fields:  Data Byte Low, A, Data Byte High, A, P
Values:  interface, 0, alert value, 0

The target address and data byte low/high are taken from the Receive Enable command or NVM configuration.

8.4.5.3 Direct Receive Method

If configured, the 82574 can send a message it needs to transfer to the external MC as a master over the SMBus, instead of alerting the MC and waiting for it to read the message. The message format follows. Note that the command that is used is the same command that is used by the external MC in the block read command. The opcode that the 82574 puts in the data is also the same as the one it puts in the block read command of the same functionality. The rules for the F and L flags (bits) are also the same as in the block read command.

Figure 47. Direct Receive Transaction Format

Bits:    1, 7, 1, 1, 1, 1, 6, 1, ...
Fields:  S, Target Address, Wr, A, F, L, Command, A, ...
Values:  MC slave address, 0, 0, first flag, last flag, receive TCO command 01 0000b, 0

Bits:    8, 1, 8, 1, 1, 8, 1, 1
Fields:  Byte Count, A, Data Byte 1, A, ..., A, Data Byte N, A, P
Values:  N, 0, 0, 0, 0
8.5 Receive TCO Flow

The 82574 is used as a channel for receiving packets from the network link and passing them to the external MC. The MC configures the 82574 to pass these specific packets to the MC. Once a full packet is received from the link and identified as a manageability packet that should be transferred to the MC, the 82574 starts the receive TCO flow to the MC.

The 82574 uses the SMBus notification method to notify the MC that it has data to deliver. Since the packet size might be larger than the maximum SMBus fragment size, the packet is divided into fragments, where the 82574 uses the maximum fragment size allowed in each fragment (configured via the NVM). The last fragment of the packet transfer is always the status of the packet. As a result, the packet is transferred in at least two fragments. The data of the packet is transferred as part of the receive TCO LAN packet transaction.

When SMBus alert is selected as the MC notification method, the 82574 notifies the MC on each fragment of a multi-fragment packet. When asynchronous notify is selected as the MC notification method, the 82574 notifies the MC only on the first fragment of a received packet. It is the MC's responsibility to read the full packet, including all the fragments.

Any timeout on the SMBus notification results in discarding the entire packet. Any NACK by the MC causes the fragment to be re-transmitted to the MC on the next receive packet command.

The maximum size of the received packet is limited by the 82574 hardware to 1536 bytes. Packets larger than 1536 bytes are silently discarded. Any packet smaller than 1536 bytes is processed by the 82574.

8.6 Transmit TCO Flow

The 82574 is used as the channel for transmitting packets from the external MC to the network link. The network packet is transferred from the MC over the SMBus and then, when fully received by the 82574, is transmitted over the network link.

The 82574 supports packets up to an Ethernet packet length of 1536 bytes. Since SMBus transactions can only be up to 240 bytes in length, packets might need to be transferred over the SMBus in more than one fragment. This is achieved using the F and L bits in the command number of the transmit TCO packet block write command. When the F bit is set, it is the first fragment of the packet. When the L bit is set, it is the last fragment of the packet. When both bits are set, the entire packet is in one fragment. The packet is sent over the network link only after all its fragments are received correctly over the SMBus. The maximum SMBus fragment size is defined within the NVM and cannot be changed by the MC.

If the packet sent by the MC is larger than 1536 bytes, the 82574 silently discards the packet.

The minimum packet length defined by the 802.3 specification is 64 bytes. The 82574 pads packets that are less than 64 bytes to meet the specification requirements (there is no need for the external MC to pad packets less than 64 bytes).

The 82574 calculates the L2 CRC on the transmitted packet and adds its four bytes at the end of the packet. Any other packet field (such as XSUM) must be calculated and inserted by the MC (the 82574 does not change any field in the transmitted packet, other than adding padding and CRC bytes).
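The fragmentation rule for the transmit flow can be sketched from the MC's side as follows. This is a minimal sketch with several assumptions: the function name `fragment_tx_packet` is hypothetical, the 240-byte limit is applied to the packet data carried in each fragment, and the F and L bits are taken to be bits 7 and 6 of the command byte, which is what the Transmit Packet command values listed in section 8.8.1 (0x84 first, 0x04 middle, 0x44 last, 0xC4 single) imply.

```python
TX_PACKET_COMMAND = 0x04   # Transmit Packet command with F and L both clear
F_BIT = 0x80               # first-fragment flag (assumed to be bit 7)
L_BIT = 0x40               # last-fragment flag (assumed to be bit 6)
MAX_PACKET = 1536          # larger packets are silently discarded by the device


def fragment_tx_packet(packet: bytes, max_fragment: int = 240) -> list[tuple[int, bytes]]:
    """Split a packet into (command byte, payload) pairs for block writes."""
    if not packet:
        raise ValueError("empty packet")
    if len(packet) > MAX_PACKET:
        raise ValueError("packet exceeds 1536 bytes and would be discarded")

    chunks = [packet[i:i + max_fragment] for i in range(0, len(packet), max_fragment)]
    fragments = []
    for index, chunk in enumerate(chunks):
        command = TX_PACKET_COMMAND
        if index == 0:
            command |= F_BIT                 # first fragment of the packet
        if index == len(chunks) - 1:
            command |= L_BIT                 # last fragment of the packet
        fragments.append((command, chunk))
    return fragments
```

A packet that fits in one fragment gets both flags and therefore the single-fragment command 0xC4. The real fragment size limit comes from the NVM, so `max_fragment` should be set to match the configured value.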
If the network link is down when the 82574 has received the last fragment of the packet from the MC, it silently discards the packet. Note that a link down event during the transfer of a packet over the SMBus does not stop the operation, since the 82574 waits for the last fragment to end to see whether the network link is up again.

8.6.1 Transmit Errors in Sequence Handling

Once a packet is transferred over the SMBus from the MC to the 82574, the F and L flags should follow specific rules. The F flag defines that this is the first fragment of the packet; the L flag defines that the transaction contains the last fragment of the packet. Table 53 lists the different flag options in transmit packet transactions:

Table 53. Flag Options During Transmit Packet Transactions

Previous   Current     Action/Notes
Last       First       Accept both.
Last       Not first   Error for the current transaction. Current transaction is discarded and an abort status is asserted.
Not last   First       Error in previous transaction. Previous transaction (until previous first) is discarded. Current packet is processed. No abort status is asserted.
Not last   Not first   Process the current transaction.

Note: Since every other block write command in the TCO protocol has both F and L flags off, such commands flush any pending transmit fragments that were previously received. When running the TCO transmit flow, no other block write transactions are allowed in between the fragments.

8.6.2 TCO Command Aborted Flow

The 82574 indicates an error or an abort condition to the MC by setting the TCO abort bit in the general status. The 82574 might also be configured to send a notification to the MC (see section 8.8.1.3.3). Following is a list of possible error and abort conditions:

? Any error in the SMBus protocol (NACK, SMBus timeouts, etc.).
? Any error in compatibility between required protocols to specific functionality (for example, an RX Enable command with a byte count not equal to 1 or 14, as defined in the command specification).
? If the 82574 does not have space to store the transmitted packet from the MC (in its internal buffer space) before sending it to the link, the packet is discarded and the external MC is notified via the abort bit.
? Error in the F/L bit sequence during multi-fragment transactions.
? An internal reset to the 82574's firmware.
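The four rows of Table 53 can be read as a small reassembly state machine. The sketch below is illustrative only: the class name `TxReassembler` is hypothetical, and it assumes that after a discarded "last, not first" transaction the device keeps waiting for a new first fragment. Flushing by unrelated block write commands is not modeled.

```python
class TxReassembler:
    """Hypothetical model of the F/L sequence rules in Table 53."""

    def __init__(self) -> None:
        self.pending: list[bytes] = []   # fragments of the packet in progress
        self.prev_last = True            # True when no packet is in progress
        self.abort = False               # mirrors the abort status

    def on_fragment(self, first: bool, last: bool, data: bytes):
        """Feed one fragment; return the full packet when it completes, else None."""
        if self.prev_last and not first:
            # Previous last, current not first: discard current, assert abort.
            self.abort = True
            return None
        if not self.prev_last and first:
            # Previous not last, current first: drop the unfinished packet,
            # process the current one, no abort status.
            self.pending = []
        # Remaining rows (last/first and not last/not first) accept the fragment.
        self.pending.append(data)
        self.prev_last = last
        if last:
            packet = b"".join(self.pending)
            self.pending = []
            return packet
        return None
```

Feeding a first fragment followed by a last fragment returns the joined packet; feeding a second first fragment before the last one silently restarts reassembly, as the third row of the table describes.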
8.7 SMBus ARP Transactions

Note: All SMBus ARP transactions include the PEC byte.

8.7.1 Prepare to ARP

This command clears the address resolved flag (set to false). It does not affect the status or validity of the dynamic SMBus address and is used to inform all devices that the ARP master is starting the ARP process:

Bits:    1, 7, 1, 1, 8, 1, 8, 1, 1
Fields:  S, Slave Address, Wr, A, Command, A, PEC, A, P
Values:  1100 001, 0, 0, 0000 0001, 0, [data dependent value], 0

8.7.2 Reset Device (General)

This command clears the address resolved flag (set to false). It does not affect the status or validity of the dynamic SMBus address.

Bits:    1, 7, 1, 1, 8, 1, 8, 1, 1
Fields:  S, Slave Address, Wr, A, Command, A, PEC, A, P
Values:  1100 001, 0, 0, 0000 0010, 0, [data dependent value], 0

8.7.3 Reset Device (Directed)

The command field is NACKed if bits 7:1 do not match the current 82574 SMBus address. This command clears the address resolved flag (set to false) and does not affect the status or validity of the dynamic SMBus address.

Bits:    1, 7, 1, 1, 8, 1, 8, 1, 1
Fields:  S, Slave Address, Wr, A, Command, A, PEC, A, P
Values:  1100 001, 0, 0, targeted slave address | 0, 0, [data dependent value], 0

8.7.4 Assign Address

This command assigns the 82574 SMBus address. The address and command bytes are always acknowledged. The transaction is aborted (NACKed) immediately if any of the UDID bytes is different from the 82574 UDID bytes. If successful, the manageability system internally updates the SMBus address. This command also sets the address resolved flag (set to true).

Bits:    1, 7, 1, 1, 8, 1, 8, 1, ...
Fields:  S, Slave Address, Wr, A, Command, A, Byte Count, A, ...
Values:  1100 001, 0, 0, 0000 0100, 0, 0001 0001, 0

Data 1 to Data 16 (8 bits each, each followed by A = 0): UDID byte 15 (MSB) down to UDID byte 0 (LSB).

Bits:    8, 1, 8, 1, 1
Fields:  Data 17, A, PEC, A, P
Values:  assigned address, 0, [data dependent value], 0

8.7.5 Get UDID (General and Directed)

The Get UDID command depends on whether this is a directed or general command. The general Get UDID SMBus transaction supports a constant command value of 0x03. The directed Get UDID SMBus transaction supports a dynamic command value equal to the dynamic SMBus address with the LSB bit set.

If the SMBus address has been resolved (address resolved flag set to true), the manageability system does not acknowledge (NACKs) this transaction if it is a general command. The manageability system always acknowledges (ACKs) a directed transaction. This command does not affect the status or validity of the dynamic SMBus address or the address resolved flag.

Fields:  S, Slave Address, Wr, A, Command, A, S, ...
Values:  1100 001, 0, 0, see above, 0

Bits:    7, 1, 1, 8, 1, ...
Fields:  Slave Address, Rd, A, Byte Count, A, ...
Values:  1100 001, 1, 0, 0001 0001, 0

Data 1 to Data 16 (8 bits each, each followed by A = 0): UDID byte 15 (MSB) down to UDID byte 0 (LSB).

Bits:    8, 1, 8, 1, 1
Fields:  Data 17, A, PEC, N, P
Values:  device slave address, 0, [data dependent value], 1

Note: Bit 0 (LSB) of data byte 17 is always 1b.
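Because every SMBus ARP transaction in section 8.7 carries a PEC byte, a short example of computing it may help. The SMBus specification defines the PEC as a CRC-8 with polynomial x^8 + x^2 + x + 1 over all bytes of the transaction, including the address and read/write bit. The function name `smbus_pec` and the example frame below are illustrative, not part of the datasheet.

```python
def smbus_pec(data: bytes) -> int:
    """CRC-8 (polynomial 0x07, initial value 0) as used for the SMBus PEC."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


# Prepare to ARP: slave address 1100 001 with Wr = 0 forms the byte 0xC2,
# followed by the command byte 0000 0001.
ARP_ADDRESS_WRITE = 0xC2
prepare_to_arp = bytes([ARP_ADDRESS_WRITE, 0x01])
frame = prepare_to_arp + bytes([smbus_pec(prepare_to_arp)])
```

A receiver can validate a frame by running the same function over all received bytes including the PEC; a result of zero means the PEC matches.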
8.8 SMBus Pass-Through Transactions

This section details all of the commands (both read and write) that the 82574 SMBus interface supports for pass-through.

8.8.1 Write Transactions

This section details the commands that the MC can send to the 82574 over the SMBus interface. The following table lists the different SMBus write transactions supported by the 82574.

TCO command                        Transaction   Command                                    Fragmentation   Section
Transmit packet                    Block write   First: 0x84, middle: 0x04, last: 0x44      Multiple        8.8.1.1
Transmit packet                    Block write   Single: 0xC4                               Single          8.8.1.1
Request status                     Block write   Single: 0xDD                               Single          8.8.1.2
Receive enable                     Block write   Single: 0xCA                               Single          8.8.1.3
Force TCO                          Block write   Single: 0xCF                               Single          8.8.1.4
Management control                 Block write   Single: 0xC1                               Single          8.8.1.5
Update MNG RCV filter parameters   Block write   Single: 0xCC                               Single          8.8.1.6

8.8.1.1 Transmit Packet Command

Note: If the overall packet length is greater than 1536 bytes, the packet is silently discarded by the 82574.

8.8.1.2 Request Status Command

An external MC can initiate a request to read the 82574 manageability status by sending a Request Status command. When received, the 82574 initiates a notification to the external MC (when status is ready), after which the external MC is able to read the status by issuing a Read Status command (see section 8.8.2.2). The format is as follows:

Function         Command   Byte count   Data 1
Request status   0xDD      1            0

8.8.1.3 Receive Enable Command

The Receive Enable command is a single-fragment command used to configure the 82574. This command has two formats: a short, 1-byte legacy format (providing backward compatibility with previous components) and a long, 14-byte advanced format (allowing greater configuration capabilities). The Receive Enable command format is as follows:
Legacy receive enable: command 0xCA, byte count 1
  Data 1            Receive control byte

Advanced receive enable: command 0xCA, byte count 14 (0x0E)
  Data 1            Receive control byte
  Data 2 ... 7      MAC addr LSB ... MAC addr MSB
  Data 8 ... 11     IP addr LSB ... IP addr MSB
  Data 12           MC SMBus addr
  Data 13           I/F data byte
  Data 14           Alert value byte

Table 54. Receive Control Byte (Data Byte 1)

Field      Bit(s)   Description
RCV_EN     0        Receive TCO enable. 0b: Disable receive TCO packets. 1b: Enable receive TCO packets. Setting this bit enables all manageability receive filtering operations. Enabling specific filters is done via the NVM or through special configuration commands. Note: When the RCV_EN bit is cleared, all receive TCO functionality is disabled, not just the packets that are directed to the MC.
RCV_ALL    1        Receive all enable. 0b: Disable receiving all packets. 1b: Enable receiving all packets. Forwards all packets received over the wire that passed L2 filtering to the external MC. This flag has no effect if bit 0 (enable TCO packets) is disabled.
EN_STA     2        Enable status reporting. 0b: Disable status reporting. 1b: Enable status reporting.
Reserved   3        Reserved. Must be set to 0b.
NM         5:4      Notification method. Defines the notification method the 82574 uses. 00b: SMBus alert. 01b: Asynchronous notify. 10b: Direct receive. 11b: Not supported.
Reserved   6        Reserved. Must be set to 1b.
CBDM       7        Configure the MC dedicated MAC address. Note: This bit should be 0b when the RCV_EN bit (bit 0) is not set. 0b: The 82574 shares the MAC address for MNG traffic with the host MAC address, which is specified in NVM words 0x0-0x2. 1b: The 82574 uses the MC dedicated MAC address as a filter for incoming receive packets. The MC MAC address is set in bytes 2-7 in this command. If a short version of the command is used, the 82574 uses the MAC address configured in the most recent long version of the command in which the CBDM bit was set. When the dedicated MAC address feature is activated, the 82574 uses the following registers to filter in all the traffic addressed to the MC MAC.

8.8.1.3.1 Management MAC Address (Data Bytes 7:2)

Ignored if the CBDM bit is not set. This MAC address is used to configure the dedicated MAC address. This MAC address is also used when the CBDM bit is set in subsequent short versions of this command.
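To make the receive control byte layout concrete, the following sketch packs the fields of Table 54 into one byte and wraps it in the legacy (1-byte) Receive Enable payload. The names `NotificationMethod`, `receive_control_byte`, and `legacy_receive_enable` are hypothetical; the returned bytes cover only command, byte count, and data, and leave out the SMBus slave address and any PEC.

```python
from enum import IntEnum


class NotificationMethod(IntEnum):
    SMBUS_ALERT = 0b00
    ASYNC_NOTIFY = 0b01
    DIRECT_RECEIVE = 0b10


def receive_control_byte(rcv_en: bool, rcv_all: bool = False, en_sta: bool = False,
                         nm: NotificationMethod = NotificationMethod.SMBUS_ALERT,
                         cbdm: bool = False) -> int:
    """Pack the Table 54 fields into the receive control byte."""
    if cbdm and not rcv_en:
        raise ValueError("CBDM should be 0b when RCV_EN is not set")
    value = 1 << 6                 # bit 6 is reserved and must be set to 1b
    value |= int(rcv_en) << 0      # RCV_EN
    value |= int(rcv_all) << 1     # RCV_ALL
    value |= int(en_sta) << 2      # EN_STA (bit 3 is reserved, left at 0b)
    value |= int(nm) << 4          # NM, bits 5:4
    value |= int(cbdm) << 7        # CBDM
    return value


def legacy_receive_enable(control: int) -> bytes:
    """Command 0xCA, byte count 1, Data 1 = receive control byte."""
    return bytes([0xCA, 0x01, control])
```

For example, enabling receive with status reporting and asynchronous notify gives a control byte of 0x55, so the legacy payload is 0xCA 0x01 0x55.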
8.8.1.3.2 Management IP Address (Data Bytes 11:8)

The 82574 does not support an ARP response. As a result, the management IP address field is ignored in the 82574.

8.8.1.3.3 Asynchronous Notification SMBus Address (Data Byte 12)

This address is used for the asynchronous notification SMBus transaction and for direct receive.

8.8.1.3.4 Interface Data (Data Byte 13)

Interface data byte used in asynchronous notification.

8.8.1.3.5 Alert Value Data (Data Byte 14)

Alert value data byte used in asynchronous notification.

8.8.1.4 Force TCO Command

This command causes the 82574 to perform a TCO reset, if force TCO reset is enabled in the NVM. The force TCO reset clears the data path (Rx/Tx) of the 82574 to enable the MC to transmit/receive packets through the 82574. This command should only be used when the MC is unable to transmit or receive and suspects that the 82574 is inoperable. This command also causes the LAN device driver to unload. It is recommended to perform a system restart to resume normal operation.

The 82574 considers the Force TCO command as an indication that the operating system is hung and clears the DRV_LOAD flag. The Force TCO reset command format is as follows:

Function          Command   Byte count   Data 1
Force TCO reset   0xCF      1            TCO mode

Where TCO mode is:

Field        Bit(s)   Description
DO_TCO_RST   0        Perform TCO reset. 0b: Do nothing. 1b: Perform TCO reset.
Reserved     7:1      Reserved (set to 0x00).

8.8.1.5 Management Control

This command is used to set generic manageability parameters. The parameters list is shown in Table 55. The command is 0xC1, stating that it is a management control command. The first data byte is the parameter number and the data that follows (length and content) is parameter specific, as shown in Table 55.
Note: If the parameter that the MC sets is not supported by the 82574, the 82574 does not NACK the transaction. After the transaction ends, the 82574 discards the data and asserts a transaction abort status.

The Management Control command format is as follows:

Function             Command   Byte count   Data 1             Data 2 ... Data N
Management control   0xC1      N            Parameter number   Parameter dependent

Table 55. Management Control Command Parameters/Content

Parameter          #      Parameter data
Keep PHY link up   0x00   A single byte parameter (Data 2). Bit 0: Set to indicate that the PHY link for this port should be kept up throughout system resets. This is useful when the server is reset and the MC needs to keep connectivity for a manageability session. 0b: Disabled. 1b: Enabled. Bits 7:1: Reserved.

8.8.1.6 Update Management Receive Filter Parameters

This command is used to set the manageability receive filter parameters. The command is 0xCC. The first data byte is the parameter number and the data that follows (length and content) is parameter specific, as listed in Table 56.

Note: If the parameter that the MC sets is not supported by the 82574, the 82574 does not NACK the transaction. After the transaction ends, the 82574 discards the data and asserts a transaction abort status.

The Update Management Receive Filter Parameters command format is as follows:

Function                                 Command   Byte count   Data 1             Data 2 ... Data N
Update manageability filter parameters   0xCC      N            Parameter number   Parameter dependent
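Both parameter-style commands share the same shape: command byte, byte count N, parameter number in Data 1, and parameter data from Data 2 onward. The sketch below builds such payloads. The helper names are hypothetical, the byte count is assumed to equal the number of data bytes that follow it (consistent with the Request Status and Receive Enable formats), and the SMBus slave address and any PEC are omitted. The decision filter example uses parameter 0x61 from the receive filter parameter list in Table 56, with the MDEF value sent MSB first.

```python
def block_write(command: int, data: bytes) -> bytes:
    """Command byte, byte count, then the data bytes."""
    if len(data) > 255:
        raise ValueError("byte count must fit in one byte")
    return bytes([command, len(data)]) + data


def management_control(parameter: int, parameter_data: bytes) -> bytes:
    # Command 0xC1: Data 1 = parameter number, Data 2..N = parameter data.
    return block_write(0xC1, bytes([parameter]) + parameter_data)


def update_filter_parameter(parameter: int, parameter_data: bytes) -> bytes:
    # Command 0xCC: same layout as the Management Control command.
    return block_write(0xCC, bytes([parameter]) + parameter_data)


# Keep PHY link up (parameter 0x00), bit 0 of Data 2 set to enable.
keep_phy_link_up = management_control(0x00, bytes([0x01]))

# Decision filter 0 (parameter 0x61): Data 2 = filter number,
# Data 3..6 = MDEF value, MSB first. The MDEF value here is an arbitrary example.
mdef_value = 0x00000003
decision_filter_0 = update_filter_parameter(0x61, bytes([0]) + mdef_value.to_bytes(4, "big"))
```

Since an unsupported parameter is not NACKed, the MC has to check the abort status after the transaction rather than rely on the SMBus acknowledge.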
Table 56 lists the different parameters and their content.

Table 56. Management RCV Filter Parameters

Parameter: Filters enables
Number: 0x1
Data: Defines the generic filters configuration. The structure of this parameter is four bytes, as in the MANC register. Note: The general filter enable is in the Receive Enable command that enables receive filtering.

Parameter: Management-to-host configuration
Number: 0xA
Data: This parameter defines which of the packet types identified as manageability packets in the receive path are directed to the host memory. Data 5:2 = MANC2H register bits.

Parameter: Flex filter 0 enable mask and length
Number: 0x10
Data: Flex filter 0 mask. Data 17:2 = mask. Bit 0 in Data 2 is the first bit of the mask. Data 19:18 = reserved. Should be set to 00b. Data 20 = flexible filter length.

Parameter: Flex filter 0 data
Number: 0x11
Data: Data 2 = group of flex filter's bytes:
  0x0 = bytes 0-29
  0x1 = bytes 30-59
  0x2 = bytes 60-89
  0x3 = bytes 90-119
  0x4 = bytes 120-127
Data 3:32 = flex filter data bytes. Data 3 is LSB. The group's length is not a mandatory 30 bytes; it might vary according to the filter's length and must not be padded by zeros.

Parameter: Flex filter 1 enable mask and length
Number: 0x20
Data: Same as parameter 0x10 but for filter 1.

Parameter: Flex filter 1 data
Number: 0x21
Data: Same as parameter 0x11 but for filter 1.

Parameter: Filters valid
Number: 0x60
Data: Four bytes to determine which of the 82574 filter registers contain valid data. Loaded into the MFVAL0 and MFVAL1 registers. Should be updated after the contents of a filter register are updated. Data 2: MSB of MFVAL. ... Data 5: LSB of MFVAL.

Parameter: Decision filters
Number: 0x61
Data: Five bytes are required to load the manageability decision filters (MDEF). Data 2: decision filter number. Data 3: MSB of MDEF register for this decision filter. ... Data 6: LSB of MDEF register for this decision filter.

Parameter: VLAN filters
Number: 0x62
Data: Three bytes are required to load the VLAN tag filters. Data 2: VLAN filter number. Data 3: MSB of VLAN filter. Data 4: LSB of VLAN filter.

Parameter: Flex port filters
Number: 0x63
Data: Three bytes are required to load the manageability flex port filters. Data 2: flex port filter number. Data 3: MSB of flex port filter. Data 4: LSB of flex port filter.

Parameter: IPv4 filters
Number: 0x64
Data: Five bytes are required to load the IPv4 address filter. Data 2: IPv4 address filter number (3:0). Data 3: MSB of IPv4 address filter. ... Data 6: LSB of IPv4 address filter.
Table 56. Management RCV Filter Parameters (continued)

Parameter: IPv6 filters
Number: 0x65
Data: 17 bytes are required to load the IPv6 address filter. Data 2: IPv6 address filter number (3:0). Data 3: MSB of IPv6 address filter. ... Data 18: LSB of IPv6 address filter.

Parameter: MAC filters
Number: 0x66
Data: Seven bytes are required to load the MAC address filters. Data 2: MAC address filters pair number (3:0). Data 3: MSB of MAC address. ... Data 8: LSB of MAC address.

8.8.2 Read Transactions (82574 to MC)

This section details the pass-through read transactions that the MC can send to the 82574 over the SMBus. Table 57 lists the different SMBus read transactions supported by the 82574. All the read transactions are compatible with the SMBus read block protocol format.

Table 57. SMBus Read Transactions

TCO command                              Transaction   Command                Opcode                                           Fragments   Section
Receive TCO packet                       Block read    0xD0 or 0xC0           First: 0x90, middle: 0x10, last (1): 0x50        Multiple    8.8.2.1
Read status                              Block read    0xD0 or 0xC0 or 0xDE   Single: 0xDD                                     Single      8.8.2.2
Get system MAC address                   Block read    0xD4                   Single: 0xD4                                     Single      8.8.2.3
Read management parameters               Block read    0xD1                   Single: 0xD1                                     Single      8.8.2.4
Read management RCV filter parameters    Block read    0xCD                   Single: 0xCD                                     Single      8.8.2.5
Read receive enable configuration        Block read    0xDA                   Single: 0xDA                                     Single      8.8.2.6

(1) The last fragment of the receive TCO packet is the packet status.

The 0xC0 and 0xD0 commands are used for more than one payload. If the MC issues these read commands and the 82574 has no pending data to transfer, it always returns the default opcode 0xDD with the 82574 status and does not NACK the transaction.
8.8.2.1 Receive TCO LAN Packet Transaction

The MC uses this command to read packets received on the LAN and their status. When the 82574 has a packet to deliver to the MC, it asserts the SMBus notification for the MC to read the data (or uses direct receive). Upon receiving notification of the arrival of a LAN receive packet, the MC begins issuing a Receive TCO Packet command using the block read protocol.

A packet is transferred to the MC in at least two fragments (at least one for the packet data and one for the packet status). As a result, the MC should follow the F and L bits of the op-code.

The op-code can have these values:

? 0x90 - first fragment
? 0x10 - middle fragment
? 0x50 - last fragment of the packet, which contains the packet status

If a notification timeout is defined (in the NVM) and the MC does not finish reading the whole packet within the timeout period after the packet has arrived, the packet is silently discarded.

Following is the receive TCO packet format and the data format returned from the 82574.

Function             Command
Receive TCO packet   0xC0 or 0xD0

Function                      Byte count   Data 1 (op-code)   Data 2 ... Data N
Receive TCO first fragment    N            0x90               Packet data byte ... packet data byte
Receive TCO middle fragment   N            0x10               Packet data byte
Receive TCO last fragment                  0x50               Packet data byte

8.8.2.1.1 Receive TCO LAN Status Payload Transaction

This transaction is the last transaction that the 82574 issues when a packet received from the LAN is transferred to the MC. The transaction contains the status of the received packet.

The format of the status transaction is as follows:

Function                  Byte count   Data 1 (op-code)   Data 2 ... Data 17 (status data)
Receive TCO long status   17 (0x11)    0x50               See below

The status is 16 bytes, where byte 0 (bits 7:0) is set in Data 2 of the status and byte 15 in Data 17 of the status.
Table 58 lists the content of the status data.

Table 58. TCO LAN Packet Status Data

Name | Bits | Description
Packet Length | 13:0 | Packet length including CRC, only 14 LSB bits.
Reserved | 24:14 | Reserved.
CRC | 25 | CRC insert (CRC insertion is needed).
Reserved | 28:26 | Reserved.
VEXT | 29 | Additional VLAN present in packet.
VP | 30 | VLAN stripped (VLAN tag insertion is needed).
Reserved | 33:31 | Reserved.
Flow | 34 | Tx/Rx packet (packet direction: 0b = Rx, 1b = Tx).
LAN | 35 | LAN number.
Reserved | 39:36 | Reserved.
Reserved | 47:40 | Reserved.
VLAN | 63:48 | The two bytes of the VLAN header tag.
Error | 71:64 | See Table 59 (Error Status Information).
Status | 79:72 | See Table 60 (Status Info).
Reserved | 87:80 | Reserved.
MNG Status | 127:88 | This field should be ignored if Receive TCO is not enabled (see Table 61, Management Status).

Bit descriptions of each field can be found in section 10.0.

Table 59. Error Status Information

Field | Bits | Description
RXE | 7 | Rx data error
IPE | 6 | IPv4 checksum error
TCPE | 5 | TCP/UDP checksum error
CXE | 4 | Carrier extension error
RSV | 3 | Reserved
SEQ | 2 | Sequence error
SE | 1 | Symbol error
CE | 0 | CRC error or alignment error
Table 60. Status Info

Field | Bits | Description
UDPV | 7 | Checksum field is valid and contains the checksum of the UDP fragment header
IPIDV | 6 | IP identification valid
CRC32V | 5 | CRC32 valid bit; indicates that the CRC32 check was done and a valid result was found
Reserved | 4 | Reserved
IPCS | 3 | IPv4 checksum calculated on packet
TCPCS | 2 | TCP checksum calculated on packet
UDPCS | 1 | UDP checksum calculated on packet
Reserved | 0 | Reserved

Table 61. Management Status

Name | Bits | Description
Pass RMCP 0x026F | 0 | Set when the UDP/TCP port of the manageability packet is 0x26F.
Pass RMCP 0x0298 | 1 | Set when the UDP/TCP port of the manageability packet is 0x298.
Pass MNG Broadcast | 2 | Set when the manageability packet is a broadcast packet.
Pass MNG Neighbor | 3 | Set when the manageability packet is a neighbor discovery packet.
Pass ARP Request/ARP Response | 4 | Set when the manageability packet is an ARP response/request packet.
Reserved | 7:5 | Reserved.
Pass MNG VLAN Filter Index | 10:8 | Reserved.
MNG VLAN Address Match | 11 | Set when the manageability packet matches one of the MNG VLAN filters.
Unicast Address Index | 14:12 | Match any of the four unicast MAC addresses.
Unicast Address Match | 15 | Match any of the four unicast MAC addresses.
L4 Port Filter Index | 22:16 | Indicates the flex filter number.
L4 Port Match | 23 | Match any of the UDP/TCP port filters.
Flex TCO Filter Index | 26:24 | If bit 27 is set, this field indicates which TCO filter was matched.
Flex TCO Filter Match | 27 | Set if a flexible filter matched.
IP Address Index | 29:28 | IP filter number (IPv4 or IPv6).
IP Address Match | 30 | Match any of the IP address filters.
IPv4/IPv6 Match | 31 | IPv4 match or IPv6 match. This bit is valid only if bit 30 (IP match bit) or bit 4 (ARP match bit) is set.
Decision Filter Match | 39:32 | Match decision filter.
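The fragment sequence of section 8.8.2.1 and the status layout of Table 58 can be sketched as follows. This is a minimal illustration, not driver code: the function names are hypothetical, each fragment is assumed to be the block-read payload with the byte count already stripped (op-code first), and the 16 status bytes are read as a little-endian value because byte 0 (bits 7:0) arrives in Data 2.

```python
FIRST, MIDDLE, LAST = 0x90, 0x10, 0x50  # op-codes from section 8.8.2.1


def reassemble(fragments):
    """Join Receive TCO Packet fragments into (packet_bytes, status_bytes).

    Hypothetical helper: each fragment starts with the op-code, followed by
    packet data (0x90, 0x10) or by the 16 status bytes (0x50).
    """
    packet = bytearray()
    for index, frag in enumerate(fragments):
        opcode, body = frag[0], frag[1:]
        if index == 0 and opcode != FIRST:
            raise ValueError("first fragment must carry op-code 0x90")
        if opcode == LAST:
            if len(body) != 16:
                raise ValueError("status fragment must carry 16 bytes")
            return bytes(packet), bytes(body)
        if index > 0 and opcode != MIDDLE:
            raise ValueError(f"unexpected op-code 0x{opcode:02X}")
        packet += body
    raise ValueError("no last fragment (0x50) seen")


def bits(value, high, low):
    """Extract bits high:low (inclusive) from an integer."""
    return (value >> low) & ((1 << (high - low + 1)) - 1)


def decode_status(status):
    """Split the 16 status bytes into the fields listed in Table 58."""
    raw = int.from_bytes(status, "little")  # byte 0 holds bits 7:0
    return {
        "packet_length": bits(raw, 13, 0),
        "crc_insert": bool(bits(raw, 25, 25)),
        "vext": bool(bits(raw, 29, 29)),
        "vlan_stripped": bool(bits(raw, 30, 30)),
        "flow_tx": bool(bits(raw, 34, 34)),   # 0b = Rx, 1b = Tx
        "lan": bits(raw, 35, 35),
        "vlan": bits(raw, 63, 48),
        "error": bits(raw, 71, 64),           # Table 59
        "status": bits(raw, 79, 72),          # Table 60
        "mng_status": bits(raw, 127, 88),     # Table 61
    }
```

The MNG status value returned here is 40 bits wide, so the bit numbers of Table 61 apply to it directly.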
8.8.2.2 Read Status Command

The MC should use this command after receiving a notification from the 82574 (such as an SMBus alert). The 82574 also sends a notification to the MC in either of the following two cases:

• The MC asserts a request for reading the status.
• The 82574 detects a change in one of the Status Data 1 bits (and was set, in the Receive Enable command, to send status to the MC on status change).

Note: Commands 0xC0/0xD0 are for backward compatibility and can be used for other payloads. For these commands, the 82574 identifies in the opcode which payload the transaction carries. When the 0xDE command is used, the 82574 always returns opcode 0xDD with the 82574 status.

The MC reads the event causing the notification using the Read Status command as follows:

Function | Command
Read Status | 0xC0 or 0xD0 or 0xDE

Function | Byte Count | Data 1 (Op-Code) | Data 2 (Status Data 1) and Data 3 (Status Data 2)
Receive TCO Partial Status | 3 | 0xDD | See below

Note: The 82574 responds to one of the commands (0xC0 or 0xD0) within a given time, as defined in the SMBus Notification Timeout and Flags word in the NVM.

Table 62 lists the Status Data Byte 1 parameters.
Table 62. Status Data Byte 1

Bit | Name | Description
7 | Reserved | Reserved.
6 | TCO Command Aborted | 1b = A TCO command abort event occurred since the last Read Status cycle. 0b = A TCO command abort event did not occur since the last Read Status cycle.
5 | Link Status Indication | 0b = LAN link down. 1b = LAN link up.
4 | PHY Link Forced Up | Contains the value of the phy_link_up bit. When set, indicates that the PHY link is configured to keep the link up.
3 | Initialization Indication | 0b = An NVM reload event has not occurred since the last Read Status cycle. 1b = An NVM reload event has occurred since the last Read Status cycle (1).
2 | Reserved | Reserved.
1:0 | Power State | 00b = Dr state. 01b = D0u state. 10b = D0 state. 11b = D3 state (2).

1. This indication is asserted when the 82574 manageability block reloads the NVM and its internal database is updated to the NVM default values. It indicates that the external MC should reconfigure the 82574 if values other than the NVM defaults should be configured.
2. In single-address mode, the 82574 reports the highest power-state modes in both devices. The "D" state is marked in this order: D0, D0u, Dr, and D3.

Status Data Byte 2 is used by the MC to indicate whether the LAN device driver is alive and running. The LAN device driver valid indication is a bit set by the LAN device driver during initialization; the bit is cleared when the LAN device driver enters a Dx state or is cleared by the hardware on a PCI reset.

Bits 2 and 1 indicate whether the LAN device driver is stuck. Bit 2 indicates whether the interrupt line of the LAN function is asserted. Bit 1 indicates whether the LAN device driver dealt with the interrupt line before the last Read Status cycle. Table 63 lists Status Data Byte 2.
Table 63. Status Data Byte 2

Bit | Name | Description
5 | Reserved | Reserved.
4 | Reserved | Reserved.
3 | Driver Valid Indication | 0b = LAN driver is not alive. 1b = LAN driver is alive.
2 | Interrupt Pending Indication | 1b = LAN interrupt line is asserted. 0b = LAN interrupt line is not asserted.
1 | ICR Register Read/Write | 1b = ICR register was read since the last Read Status cycle. 0b = ICR register was not read since the last Read Status cycle. Reading the ICR indicates that the driver has dealt with the interrupt that was asserted.
0 | Reserved | Reserved.

Notes:
1. The LAN device driver alive indication is set if one of the LAN device drivers is alive.
2. The LAN interrupt is considered asserted if one of the interrupt lines is asserted.
3. The ICR is considered read if one of the ICRs was read (LAN 0 or LAN 1).

Table 64 lists the possible values of bits 2 and 1 and what the MC can assume from the bits:

Table 64. Status Data Byte 2 (Bits 2 and 1)

Previous | Current | Description
Don't care | 00b | Interrupt is not pending (OK).
00b | 01b | New interrupt is asserted (OK).
10b | 01b | New interrupt is asserted (OK).
11b | 01b | Interrupt is waiting for reading (OK).
01b | 01b | Interrupt is waiting for reading by the driver for more than one read cycle (not OK). Possible driver hang state.
Don't care | 11b | Previous interrupt was read and current interrupt is pending (OK).
Don't care | 10b | Interrupt is not pending (OK).

Note: The interval between MC reads should allow for the time it takes the LAN device driver to deal with the interrupt (in µs). Excessive reads by the MC can give false indications.
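The two status bytes and the hang check of Table 64 can be sketched as follows. The function names are hypothetical. One assumption deserves emphasis: the text does not say in which order the two-bit codes of Table 64 are written, so the sketch works from the bit meanings in Table 63 instead and flags a possible hang when an interrupt is pending and the ICR has not been read in two consecutive Read Status cycles, which is how the "01b then 01b" row reads when the code is taken as (ICR read, interrupt pending).

```python
POWER_STATES = {0b00: "Dr", 0b01: "D0u", 0b10: "D0", 0b11: "D3"}


def decode_status_data_1(byte):
    """Decode Status Data Byte 1 (Table 62)."""
    return {
        "tco_command_aborted": bool(byte & 0x40),  # bit 6
        "link_up": bool(byte & 0x20),              # bit 5
        "phy_link_forced_up": bool(byte & 0x10),   # bit 4
        "nvm_reloaded": bool(byte & 0x08),         # bit 3
        "power_state": POWER_STATES[byte & 0x03],  # bits 1:0
    }


def decode_status_data_2(byte):
    """Decode Status Data Byte 2 (Table 63)."""
    return {
        "driver_valid": bool(byte & 0x08),       # bit 3
        "interrupt_pending": bool(byte & 0x04),  # bit 2
        "icr_read": bool(byte & 0x02),           # bit 1
    }


def possible_driver_hang(previous, current):
    """True when an interrupt stayed unread across two Read Status cycles.

    Assumption: this is the "not OK" row of Table 64, interpreted through
    the bit meanings of Table 63 rather than through the raw two-bit code.
    """
    def waiting(state):
        return state["interrupt_pending"] and not state["icr_read"]

    return waiting(previous) and waiting(current)
```

As the note under Table 64 warns, polling faster than the driver can service its interrupt would make this check report hangs that are not real.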
8.8.2.3 Get System MAC Address

The Get System MAC Address command returns the system MAC address over the SMBus. This command is a single-fragment read block transaction.

Note: This command returns the MAC address configured in NVM offset 0.

Get System MAC Address format:

Function | Command
Get System MAC Address | 0xD4

Data returned from the 82574:

Function | Byte Count | Data 1 (Op-Code) | Data 2 | ... | Data 7
Get System MAC Address | 7 | 0xD4 | MAC address MSB | ... | MAC address LSB

8.8.2.4 Read Management Parameters

To read the management parameters, the MC should execute two SMBus transactions. The first transaction is a block write that sets the parameter that the MC wants to read. The second transaction is a block read that reads the parameter.

Block write transaction:

Function | Command | Byte Count | Data 1
Management Control Request | 0xC1 | 1 | Parameter number

Following the block write, the MC should issue a block read that reads the parameter that was set in the block write command:

Function | Command
Read Management Parameter | 0xD1

Data returned from the 82574:

Function | Byte Count | Data 1 (Op-Code) | Data 2 | Data 3 ... Data N
Read Management Parameter | N | 0xD1 | Parameter number | Parameter dependent

The returned data is in the same format as the MC command.

Note: The parameter that is returned might not be the parameter requested by the MC. The MC should verify the parameter number (the default parameter to be returned is 0x1).

Note: If the parameter number is 0xFF, the data that was requested from the 82574 is not ready yet. The MC should retry the read transaction.
It is the responsibility of the MC to follow the procedure previously defined. When the MC sends a block read command (as previously described) that is not preceded by a block write command with bytecount=1, the 82574 sets the parameter number in the read block transaction to 0xFE.

8.8.2.5 Read Management Receive Filter Parameters

To read the MNG RCV filter parameters, the MC should execute two SMBus transactions. The first transaction is a block write that sets the parameter that the MC wants to read. The second transaction is a block read that reads the parameter.

Block write transaction:

Function | Command | Byte Count | Data 1 | Data 2
Update MNG RCV Filter Parameters | 0xCC | 1 or 2 | Parameter number | Parameter data

The parameters supported for this command are the same as the parameters supported for Update MNG Receive Filter Parameters.

Following the block write, the MC should issue a block read that reads the parameter that was set in the block write command:

Function | Command
Request MNG RCV Filter Parameters | 0xCD

Data returned from the 82574:

Function | Byte Count | Data 1 (Op-Code) | Data 2 | Data 3 ... Data N
Read MNG RCV Filter Parameters | N | 0xCD | Parameter number | Parameter dependent

Note: The parameter that is returned might not be the parameter requested by the MC. The MC should verify the parameter number (the default parameter to be returned is 0x1).

Note: If the parameter number is 0xFF, the data that the 82574 should supply is not ready yet. The MC should retry the read transaction.

It is the responsibility of the MC to follow the procedure previously defined. When the MC sends a block read command (as previously described) that is not preceded by a block write command with bytecount=1, the 82574 sets the parameter number in the read block transaction to 0xFE.
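The two-step read procedure of sections 8.8.2.4 and 8.8.2.5 can be sketched as follows. This is an illustration under stated assumptions: `bus` is a hypothetical object offering `block_write(cmd, data)` and `block_read(cmd)`, where `block_read` returns the payload with the byte count already stripped (op-code first), and the retry count is arbitrary. Treating a mismatched parameter number as an error is one possible policy; the text only says the MC should verify it.

```python
NOT_READY = 0xFF   # requested data is not ready yet; retry the block read
NO_REQUEST = 0xFE  # block read was not preceded by a valid block write


def read_parameter(bus, write_cmd, read_cmd, param, extra=b"", retries=3):
    """Select a parameter with a block write, then fetch it with a block read.

    For Read Management Parameters use write_cmd=0xC1, read_cmd=0xD1.
    For Read MNG RCV Filter Parameters use write_cmd=0xCC, read_cmd=0xCD and
    pass the optional second byte (for example a filter number) in `extra`.
    """
    bus.block_write(write_cmd, bytes([param]) + extra)
    for _ in range(retries):
        reply = bus.block_read(read_cmd)
        opcode, returned_param, data = reply[0], reply[1], reply[2:]
        if opcode != read_cmd:
            raise IOError(f"unexpected op-code 0x{opcode:02X}")
        if returned_param == NOT_READY:
            continue  # data not ready: repeat the read transaction
        if returned_param == NO_REQUEST:
            raise IOError("82574 reports no preceding block write")
        if returned_param != param:
            raise IOError(f"82574 returned parameter 0x{returned_param:02X}")
        return data
    raise TimeoutError("parameter data still not ready")
```

A caller would typically wrap its SMBus host-controller access in an object with these two methods and pass it in as `bus`.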
Parameters supported for the block write in section 8.8.2.5:

Parameter | # | Parameter Data
Filters Enable | 0x01 | None
MANC2H Configuration | 0x0A | None
Flex Filter 0 Enable Mask and Length | 0x10 | None
Flex Filter 0 Data | 0x11 | Data 2: group of the flex filter's bytes: 0x0 = bytes 0-29, 0x1 = bytes 30-59, 0x2 = bytes 60-89, 0x3 = bytes 90-119, 0x4 = bytes 120-127
Flex Filter 1 Enable Mask and Length | 0x20 | None
Flex Filter 1 Data | 0x21 | Same as parameter 0x11 but for filter 1.
Filters Valid | 0x60 | None
Decision Filters | 0x61 | One byte to define the accessed manageability decision filter (MDEF). Data 2 - decision filter number
VLAN Filters | 0x62 | One byte to define the accessed VLAN tag filter (MAVTV). Data 2 - VLAN filter number
Flex Ports Filters | 0x63 | One byte to define the accessed manageability flex port filter (MFUTP). Data 2 - flex port filter number
IPv4 Filter | 0x64 | One byte to define the accessed IPv4 address filter (MIPAF). Data 2 - IPv4 address filter number
IPv6 Filters | 0x65 | One byte to define the accessed IPv6 address filter (MIPAF). Data 2 - IPv6 address filter number
MAC Filters | 0x66 | One byte to define the accessed MAC address filters pair (MMAL, MMAH). Data 2 - MAC address filters pair number (0-3)

8.8.2.6 Read Receive Enable Configuration

The MC uses this command to read the receive configuration data. This data can be configured using the Receive Enable command or through the NVM.

The Read Receive Enable Configuration command format (SMBus read block) is as follows:

Function | Command
Read Receive Enable | 0xDA

Data returned from the 82574: see the Read Receive Enable response table (byte count 15, op-code 0xDA) printed alongside section 8.9.4.
8.9 SMBus Troubleshooting

This section outlines the most common issues found while working with pass-through using the SMBus sideband interface.

8.9.1 SMBus Commands Are Always NACK'd by the 82574

There are several reasons why all commands sent to the 82574 from an MC could be NACK'd. The following are the most common:

• Invalid NVM image - The image itself might be invalid, or it could be a valid image that is not a pass-through image, in which case SMBus connectivity is disabled.
• The MC is not using the correct SMBus address - Many MC vendors hard-code the SMBus address(es) into their firmware. If incorrect values are hard-coded, the 82574 does not respond.
• The SMBus address(es) can also be dynamically set using the SMBus ARP mechanism.
• Bus interference - The bus connecting the MC and the 82574 might be unstable.

8.9.2 SMBus Clock Speed Is 16.6666 kHz

This can happen when the SMBus connecting the MC and the 82574 is also tied into another device (such as an ICH) that has a maximum clock speed of 16.6666 kHz. The solution is to not connect the SMBus between the 82574 and the MC to this device.

8.9.3 A Network Based Host Application Is Not Receiving Any Network Packets

Reports have been received about an application not receiving any network packets. The application in question was NFS under Linux. The problem was that the application was using the RMCP/RMCP+ IANA reserved port 0x26F (623), and the system was also configured for a shared MAC and IP address with the OS and MC.

The management control to host configuration, in this situation, was set up not to send RMCP traffic to the OS (this is typically the correct configuration). This means that no traffic sent to port 623 was being routed to the OS.

The solution in this case is to configure the problematic application not to use the reserved port 0x26F.
Read Receive Enable Configuration response format (data returned from the 82574 for command 0xDA, section 8.8.2.6):

Function | Byte Count | Data 1 (Op-Code) | Data 2 | Data 3 ... Data 8 | Data 9 ... Data 12 | Data 13 | Data 14 | Data 15
Read Receive Enable | 15 (0x0F) | 0xDA | Receive control byte | MAC addr LSB ... MAC addr MSB | IP addr LSB ... IP addr MSB | MC SMBus addr | I/F data byte | Alert value byte

8.9.4 Status Registers

If the NVM image is configured correctly, the physical connections are valid, and problems still exist, use utilities/drivers to check the appropriate 82574 status registers for other indications.
8.9.4.1 Firmware Semaphore Register (FWSM, 0x5B54)

This register (described in detail in section 10.0) provides a way to find out if the firmware on the 82574 is functioning properly and, if so, in what mode.

Check the error indication bits (24:19). If they are anything other than zero, the firmware is not fully functional, if it is functional at all. The most common errors are:

• NVM checksum errors - These can be caused by a number of things:
  - Mismatch in 82574 stepping and NVM image version (old NVM image on a new 82574)
  - NVM part too small (recommended minimum size for manageability is 32 kb)
  - Old utility was used to update the NVM (always make sure to have the latest versions)
• Invalid firmware mode (0x08) - If bits 3:1 of the register indicate a firmware mode that is reserved, this error condition can be reset.

Always make note of the firmware mode, bits 3:1. In nearly all cases, this value should be set to 010b for pass-through mode to an external MC.

The firmware valid bit (15) should be set to 1b to indicate that the firmware is up and running. If it is not set to 1b, an error code should be indicated in bits 24:19.

The reset count bits (18:16) indicate how many times the internal firmware on the 82574 has been reset. This value should be one (the firmware was reset at power up). If the value is greater than one, there are issues somewhere. Note that this counter goes from 0-7 and wraps around.

8.9.4.2 Management Control Register (MANC, 0x5820)

This register is described in detail in section 10.0.

This register indicates which filters are enabled. It is possible to configure all of the filters yet not enable them, in which case no management traffic is routed to the MC. Alternatively, the MC might be receiving undesired traffic, such as ARP requests when the 82574 was configured to do automatic ARP responses.
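The FWSM checks listed in section 8.9.4.1 can be sketched as a small decoder. This is only an illustration: how the 32-bit register value is obtained (driver, utility, or register dump) is outside its scope, the function name is hypothetical, and the warning texts simply restate the rules of thumb given in that section.

```python
def decode_fwsm(value):
    """Split a 32-bit FWSM value (register 0x5B54) into the fields named
    in section 8.9.4.1 and collect the conditions that section flags."""
    fields = {
        "firmware_mode": (value >> 1) & 0x7,       # bits 3:1
        "firmware_valid": bool((value >> 15) & 1),  # bit 15
        "reset_count": (value >> 16) & 0x7,        # bits 18:16, wraps at 7
        "error_indication": (value >> 19) & 0x3F,  # bits 24:19
    }
    warnings = []
    if fields["error_indication"] != 0:
        warnings.append(f"error indication 0x{fields['error_indication']:02X}")
    if not fields["firmware_valid"]:
        warnings.append("firmware valid bit is not set")
    if fields["firmware_mode"] != 0b010:
        warnings.append("firmware mode is not 010b (pass-through)")
    if fields["reset_count"] > 1:
        warnings.append("firmware was reset more than once")
    fields["warnings"] = warnings
    return fields
```

Because the reset counter wraps, a value of 0 or 1 does not rule out earlier resets; the check above only catches the cases the section describes.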
Check this register if unwanted traffic is arriving or if packets aren't being sent to the MC.

Bit 17 (Receive TCO Packets Enable) must also be set in order for any packets to be sent to an MC. It doesn't matter what the other enabled filters are; if this bit is off, no packets are sent to the MC.

Bit 21 (Enable Management-to-Host) enables or disables the various filters that allow manageability traffic (all traffic that passes the filters in the 82574) to optionally be passed to the operating system as well.

8.9.5 Unable to Transmit Packets from the MC

If the MC has been transmitting and receiving data without issue for a period of time and then begins to receive NACKs from the 82574 when it attempts to write a packet, the most likely cause is that the buffers internal to the 82574 are full of data that has been received from the network but has not yet been read by the MC.
Being an embedded device, the 82574 has limited buffers that it shares for receiving and transmitting data. If an MC does not keep up with reading the incoming data, the 82574 can fill up, which prevents the MC from transmitting any more data and results in NACKs.

If this situation occurs, the recommended solution is to have the MC issue a Receive Enable command to disable any more incoming data, read all the data from the 82574, and then use the Receive Enable command to enable incoming data once again.

8.9.6 SMBus Fragment Size

The SMBus specification indicates a maximum SMBus transaction size of 32 bytes. Most of the data passed between the 82574 and the MC over the SMBus is RMCP/RMCP+ traffic, which by its very nature (UDP traffic) is significantly larger than 32 bytes in length, thus requiring multiple SMBus transactions to move a packet from the 82574 to the MC or to send a packet from the MC to the 82574.

Recognizing this bottleneck, the 82574 can handle up to 240 bytes of data within a single transaction. This is a configurable setting within the NVM. The default value in the NVM images is 32, per the SMBus specification. If performance is an issue, it is recommended that you increase this size.

During the initialization phase, the firmware within the 82574 allocates buffers based upon the SMBus fragment size setting within the NVM. The 82574 firmware has a finite amount of RAM for its use, so the larger the SMBus fragment size, the fewer buffers it can allocate. The MC implementation must therefore take care to send data over the SMBus in an efficient way.

For example, the 82574 firmware has 3 KB of RAM it can use for buffering SMBus fragments. If the SMBus fragment size is 32 bytes, the firmware could allocate 96 buffers of 32 bytes each. As a result, the MC could then send a large packet of data (such as KVM) that is 800 bytes in size in 25 fragments of 32 bytes apiece.
However, this might not be the most efficient way, because the MC must break the 800 bytes of data into 25 fragments and send them one at a time.

If the SMBus fragment size is changed to 240 bytes, the 82574 firmware can create 12 buffers of 240 bytes each to receive SMBus fragments. The MC can now send that same 800 bytes of KVM data in only four fragments, which is much more efficient.

A problem arises when the SMBus fragment size is changed in the NVM but the MC does not reflect this change. If a programmer changes the SMBus fragment size in the 82574 to 240 bytes and then wants to send 800 bytes of KVM data, an MC that has not been updated still sends the data in 32-byte fragments. As a result, the firmware runs out of memory.

This is because the 82574 firmware created the 12 buffers of 240 bytes each for fragments, but the MC is only sending fragments of 32 bytes. This wastes 208 bytes of memory per fragment in this case, and when the MC attempts to send more than 12 fragments in a single transaction, the 82574 NACKs the SMBus transaction because there is not enough memory to store the KVM data.

In summary, if a programmer increases the SMBus fragment size in the NVM, which is recommended for efficiency purposes, take care to ensure that the MC implementation reflects this change and uses that fragment size to its fullest when sending SMBus fragments.
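The buffer arithmetic in this section can be reproduced with a few lines. The sketch below uses the figures from the example (3 KB of buffer RAM, an 800-byte payload); the function names are hypothetical, and the model is deliberately simple: one buffer per fragment, whatever the fragment's actual length.

```python
RAM_BYTES = 3 * 1024  # RAM available for SMBus fragment buffers (example value)


def buffer_count(nvm_fragment_size):
    """Number of buffers the firmware can allocate for a given NVM setting."""
    return RAM_BYTES // nvm_fragment_size


def fragments_needed(payload_bytes, mc_fragment_size):
    """Fragments the MC must send, rounding up the last partial fragment."""
    return (payload_bytes + mc_fragment_size - 1) // mc_fragment_size


def wasted_per_fragment(nvm_fragment_size, mc_fragment_size):
    """Buffer bytes left unused when the MC sends smaller fragments."""
    return nvm_fragment_size - mc_fragment_size


def fits(payload_bytes, nvm_fragment_size, mc_fragment_size):
    """True if the payload's fragments do not outnumber the buffers."""
    needed = fragments_needed(payload_bytes, mc_fragment_size)
    return needed <= buffer_count(nvm_fragment_size)
```

With these helpers, `fits(800, 240, 32)` is false (25 fragments against 12 buffers), which is the mismatch case described above, while `fits(800, 240, 240)` and `fits(800, 32, 32)` are both true.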
8.9.7 Enable XSUM Filtering

If XSUM filtering is enabled, the MC does not need to check this checksum for incoming packets. Only packets that have a valid XSUM are passed to the MC; all others are silently discarded. This is a way to offload some work from the MC.

8.9.8 Still Having Problems?

If problems still exist, contact your field representative. Before contacting, be prepared to provide the following:

• The contents of status registers:
  - 0x5820
  - 0x5860
  - 0x5B54
• An SMBus trace, if possible
• A dump of the NVM image
  - This should be taken from the actual 82574, rather than the NVM image provided by Intel. Parts of the NVM image are changed after writing, such as the physical NVM size. This information could be key in solving an issue.

8.10 NC-SI Interface

The Network Controller Sideband Interface (NC-SI) is a DMTF industry standard protocol for the sideband interface. NC-SI uses a modified version of the industry standard RMII interface for the physical layer and defines a new logical layer.

The NC-SI specification can be found at the DMTF website at: http://www.dmtf.org/

8.11 Overview

8.11.1 Terminology

The terminology in this document is taken directly from the NC-SI specification and is as follows:
Term | Definition
Frame Versus Packet | Frame is used in reference to Ethernet, whereas packet is used everywhere else.
External Network Interface | The interface of the network controller that provides connectivity to the external network infrastructure (port).
Internal Host Interface | The interface of the network controller that provides connectivity to the host OS running on the platform.
Management Controller (MC) | An intelligent entity composed of HW/FW/SW that resides within a platform and is responsible for some or all management functions associated with the platform (MC, service processor, etc.).
Network Controller (NC) | The component within a system that is responsible for providing connectivity to the external Ethernet networked world.
Remote Media | The capability to allow remote media devices to appear as if they were attached locally to the host.
Network Controller Sideband Interface | The interface of the network controller that provides connectivity to a management controller. It can be shortened to sideband interface as appropriate in the context.
Interface | This refers to the entire physical interface, such as both the transmit and receive interface between the management controller and the network controller.
Integrated Controller | The term integrated controller refers to a network controller device that supports two or more channels for NC-SI that share a common NC-SI physical interface. For example, a network controller that has two or more physical network ports and a single NC-SI bus connection.
Multi-Drop | Multi-drop commonly refers to the case where multiple physical communication devices share an electrically common bus and a single device acts as the master of the bus and communicates with multiple slave or target devices. In NC-SI, a management controller serves as the master, and the network controllers are the target devices.
Point-to-Point | Point-to-point commonly refers to the case where only two physical communication devices are interconnected via a physical communication medium. The devices might be in a master/slave relationship, or could be peers. In NC-SI, point-to-point operation refers to the situation where only a single management controller and a single network controller package are used on the bus in a master/slave relationship where the management controller is the master.
Channel | The control logic and data paths supporting NC-SI pass-through operation on a single network interface (port). A network controller that has multiple network interface ports can support an equivalent number of NC-SI channels.
Package | One or more NC-SI channels in a network controller that share a common set of electrical buffers and common buffer control for the NC-SI bus. Typically, there will be a single, logical NC-SI package for a single physical network controller package (chip or module). However, the specification allows a single physical chip or module to hold multiple NC-SI logical packages.
Control Traffic/Messages/Packets | Command, response, and notification packets transmitted between the MC and NCs for the purpose of managing NC-SI.
Pass-Through Traffic/Messages/Packets | Non-control packets passed between the external network and the MC through the NC.
Term | Definition
Channel Arbitration | Refers to operations where more than one of the network controller channels can be enabled to transmit pass-through packets to the MC at the same time, where arbitration of access to the RXD, CRS_DV, and RX_ER signal lines is accomplished either by software or hardware means.
Logically Enabled/Disabled NC | Refers to the state of the network controller wherein pass-through traffic is able/unable to flow through the sideband interface to and from the management controller, as a result of issuing the Enable/Disable Channel command.
NC Rx | Defined as the direction of ingress traffic on the external network controller interface.
NC Tx | Defined as the direction of egress traffic on the external network controller interface.
NC-SI Rx | Defined as the direction of ingress traffic on the sideband enhanced NC-SI interface with respect to the network controller.
NC-SI Tx | Defined as the direction of egress traffic on the sideband enhanced NC-SI interface with respect to the network controller.

8.11.2 System Topology

In NC-SI, each physical endpoint (NC package) can have several logical slaves (NC channels). NC-SI defines that one management controller and up to four network controller packages can be connected to the same NC-SI link.

Figure 48 shows an example topology for a single MC and a single NC package. In this example, the NC package has two NC channels.

Figure 48. Single NC Package, Two NC Channels
[Diagram: a management controller (MC) connected over the NC-SI link to one NC package (package ID = 0x0) that contains two NC channels (internal channel ID = 0x0 and internal channel ID = 0x1) with ports LAN0 and LAN1.]
Figure 49 shows an example topology for a single MC and two NC packages. In this example, one NC package has two NC channels and the other has only one NC channel.

Figure 49. Two NC Packages (Left, with Two NC Channels and Right, with One NC Channel)
[Diagram: a management controller (MC) connected over a shared NC-SI link to two NC packages. Package ID = 0x0 contains two NC channels (internal channel ID = 0x0 and internal channel ID = 0x1) with ports LAN0 and LAN1. Package ID = 0x1 contains one NC channel (internal channel ID = 0x0) with one LAN port.]

Scenarios in which the NC-SI lines are shared by multiple NCs (as shown in Figure 49) mandate an arbitration mechanism. The arbitration mechanism is described in section 8.15.1.

8.11.3 Data Transport

Since NC-SI is based upon the RMII transport layer, data is transferred in the form of Ethernet frames.

NC-SI defines two types of frames transmitted on the NC-SI interface:

1. Control frames:
   a. Frames used to configure and control the interface.
   b. Control frames are identified by a unique Ethertype in their L2 header.
2. Pass-through frames:
   a. The actual LAN pass-through frames transferred from/to the MC.
   b. Pass-through frames are identified as not being a control frame.
   c. Pass-through frames are attributed to a specific NC channel by their source MAC address (as configured in the NC by the MC).

8.11.3.1 Control Frames

NC-SI control frames are identified by a unique NC-SI Ethertype (0x88F8).

Control frames are used in a single-threaded operation, meaning commands are generated only by the MC and can only be sent one at a time. Each command from the MC is followed by a single response from the NC (command-response flow), after which the MC is allowed to send a new command.
The only exception to the command-response flow is the Asynchronous Event Notification (AEN). These control frames are sent unsolicited from the NC to the MC.

Note: AEN functionality by the NC must be disabled by default, until activated by the MC using the Enable AEN commands.

In order to be considered a valid command, the control frame must:

1. Comply with the NC-SI header format.
2. Be targeted to a valid channel in the package via the package ID and channel ID fields. For example, to target an NC channel with a package ID of 0x2 and an internal channel ID of 0x5, the MC must set the channel ID inside the control frame to 0x45.
   Note: The channel ID is composed of three bits of package ID and five bits of internal channel ID.
3. Contain a correct payload checksum (if used).
4. Meet any other condition defined by NC-SI.

Note: There are also commands (such as Select Package) targeted to the package as a whole. These commands must use an internal channel ID of 0x1F.

For more details, refer to the NC-SI specification.

8.11.3.2 NC-SI Frames Receive Flow

Figure 50 shows the overall flow for frames received on the NC from the MC.

Figure 50. NC-SI Frames Receive Flow for the NC
[Flowchart: an NC-SI frame is received from the MC. If its Ethertype equals the NC-SI Ethertype, it is processed as an NC-SI control frame. Otherwise, if its source MAC address equals a previously configured MAC address, it is sent to the LAN with the matching configured MAC address. Otherwise, the frame is dropped (it belongs to a different package).]
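The channel ID composition and the receive flow of Figure 50 can be sketched as follows. The function names are hypothetical, and the frame parser assumes an untagged Ethernet frame (6-byte destination, 6-byte source, 2-byte Ethertype); `configured_macs` stands for whatever set of source MAC addresses the MC has configured in the NC.

```python
NCSI_ETHERTYPE = 0x88F8
PACKAGE_WIDE_CHANNEL = 0x1F  # internal channel ID for package-wide commands


def channel_id(package_id, internal_channel_id):
    """Combine a 3-bit package ID and a 5-bit internal channel ID."""
    if not 0 <= package_id <= 0x7:
        raise ValueError("package ID must fit in three bits")
    if not 0 <= internal_channel_id <= 0x1F:
        raise ValueError("internal channel ID must fit in five bits")
    return (package_id << 5) | internal_channel_id


def classify_frame(frame, configured_macs):
    """Follow Figure 50 for a frame received from the MC.

    Returns "control", "pass-through", or "drop".
    """
    source_mac = bytes(frame[6:12])
    ethertype = int.from_bytes(frame[12:14], "big")
    if ethertype == NCSI_ETHERTYPE:
        return "control"
    if source_mac in configured_macs:
        return "pass-through"
    return "drop"  # belongs to a different package
```

For the example in the text, `channel_id(0x2, 0x5)` gives 0x45, and a Select Package command for package 0x2 would use `channel_id(0x2, PACKAGE_WIDE_CHANNEL)`.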
8.12 NC-SI Support

8.12.1 Supported Features

The 82574 supports all the mandatory features of the NC-SI specification (rev 1.0.0a). Table 65 lists the supported commands. Table 66 lists the optional features supported.

Table 65. Supported NC-SI Commands

Command | Supported?
Clear Initial State | Yes
Get Version ID | Yes
Get Parameters (1) | Yes
Get Controller Packet Statistics | No
Get Link Status | Yes
Enable Channel | Yes
Disable Channel | Yes
Reset Channel | Yes
Enable VLAN | Yes
Disable VLAN | Yes
Enable Broadcast | Yes
Disable Broadcast | Yes
Set MAC Address | Yes
Get NC-SI Statistics | Yes, partially
Enable NC-SI Flow Control | No
Disable NC-SI Flow Control | No
Set Link Command | Yes
Enable Global Multi-Cast Filter | Yes, partially
Disable Global Multi-Cast Filter | Yes
Get Capabilities | Yes
Set VLAN Filters | Yes
AEN Enable | Yes
Get Pass-Through Statistics | Yes, partially
Select Package | Yes
Deselect Package | Yes
Enable Channel Network Tx | Yes
Disable Channel Network Tx | Yes
OEM Command | Yes

1. The link settings field in the Get Parameters response packet includes the value as defined in the Get Link Status command.
Table 66. Optional NC-SI Features Support

Feature | Implement | Details
AENs | Yes, partially | Report support for all three AENs currently defined in the Get Capabilities command.
Get NC-SI Statistics command | Yes, partially | Support the following counters: 1-4, 7.
Enable/Disable Global Multi-Cast Filter | Yes, partially | No support for specific multicast filtering. Support is to either filter out all multicast packets (Enable command) or pass all multicast packets to the MC (Disable command).
Get NC-SI Pass-Through Statistics command | Yes, partially | Support the following counters: 2. Support the following counters only when the OS is down: 1, 6, 7.
VLAN Modes | Yes, partially | Support only modes 1, 3.
Buffering Capabilities | Yes | 7 KB.
MAC Address Filters | Yes | Support one MAC address as mixed per port.
Channel Count | Yes | Support one channel.
VLAN Filters | Yes | Support two VLAN filters per port.
Broadcast Filters | Yes | Support the following filters: ARP, DHCP, NetBIOS.
Set NC-SI Flow Control command | No | Do not support NC-SI flow control.
Hardware Arbitration | No | Do not support NC-SI hardware arbitration.

8.12.2 NC-SI Mode - Intel Specific Commands

In addition to the regular NC-SI commands, the following Intel vendor specific commands are supported. The purpose of these commands is to provide a means for the MC to access some of the Intel-specific features present in the 82574.

8.12.2.1 Overview

The following features are available via the NC-SI OEM specific command:

• Get System MAC Address - This command enables the MC to retrieve the system MAC address used by the NC. This MAC address can be used for a shared MAC address mode.
• TCO Reset - Enables the MC to reset the 82574.

These commands are designed to be compliant with their corresponding SMBus commands (if existing).

All of the commands are based on a single DMTF defined NC-SI command, known as OEM Command. This command is as follows.
8.12.2.1.1 OEM Command (0x50)

The OEM command can be used by the MC to request the sideband interface to provide vendor-specific information. The Vendor Enterprise Number (VEN) is the unique MIB/SNMP private enterprise number assigned by IANA per organization. Vendors are free to define their own internal data structures in the vendor data fields.

Figure 51. OEM Command Packet Format

Bytes     Bits 31..24            23..16    15..08    07..00
00..15    NC-SI header
16..19    Manufacturer ID (Intel 0x157)
20..      Intel command number   Optional data

8.12.2.1.2 OEM Response (0xD0)

Following is the vendor-specific format for responses, as defined by NC-SI.

Figure 52. OEM Response Packet Format

Bytes     Bits 31..24            23..16    15..08    07..00
00..15    NC-SI header
16..19    Response code                    Reason code
20..23    Manufacturer ID (Intel 0x157)
24..27    Intel command number   Optional return data

8.12.2.1.3 OEM Specific Command Response Reason Codes

Response Code                Reason Code
Value   Description          Value    Description
0x1     Command failed       0x5081   Invalid Intel command number
0x1     Command failed       0x5082   Invalid Intel command parameter number
0x1     Command failed       0x5085   Internal network controller error
0x1     Command failed       0x5086   Invalid vendor enterprise code
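The OEM command and response layouts in Figures 51 and 52 can be illustrated with a short Python sketch. It handles only the bytes that follow the 16-byte NC-SI header and assumes network (big-endian) byte order and 16-bit response and reason codes, as in NC-SI; the function names are hypothetical and not part of any Intel or DMTF API.

```python
import struct

INTEL_MANUFACTURER_ID = 0x157  # manufacturer ID shown in Figures 51 and 52

# Reason codes from section 8.12.2.1.3
OEM_REASON_CODES = {
    0x5081: "invalid Intel command number",
    0x5082: "invalid Intel command parameter number",
    0x5085: "internal network controller error",
    0x5086: "invalid vendor enterprise code",
}


def build_oem_payload(intel_command, data=b""):
    """Return the OEM command bytes that follow the NC-SI header.

    Layout (Figure 51): 4-byte manufacturer ID, 1-byte Intel command
    number, then optional data. Big-endian order is an assumption.
    """
    return struct.pack(">IB", INTEL_MANUFACTURER_ID, intel_command) + bytes(data)


def parse_oem_response(payload):
    """Split the OEM response bytes that follow the NC-SI header.

    Layout (Figure 52): response code, reason code, manufacturer ID,
    Intel command number, optional return data.
    """
    if len(payload) < 9:
        raise ValueError("OEM response payload is too short")
    response_code, reason_code, manufacturer_id, intel_command = struct.unpack_from(
        ">HHIB", payload, 0
    )
    return {
        "response_code": response_code,
        "reason_code": reason_code,
        "reason_text": OEM_REASON_CODES.get(reason_code, ""),
        "manufacturer_id": manufacturer_id,
        "intel_command": intel_command,
        "data": payload[9:],
    }
```

A caller would prepend its own NC-SI header with control packet type 0x50 before sending, and strip the header from a type 0xD0 response before parsing.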
Table 67. Commands Summary

Intel Command   Parameter   Command Name
0x06            N/A         Get System MAC Address
0x22            N/A         Perform TCO Reset

8.12.2.2 Proprietary Commands Format

8.12.2.2.1 Get System MAC Address Command (Intel Command 0x06)

In order to support a system configuration that requires the NC to hold the MAC address for the MC (such as shared MAC address mode), the following command is provided to enable the MC to query the NC for a valid MAC address.

The NC must return the system MAC address. The MC should use the returned MAC address as a shared MAC address by setting it using the Set MAC Address command as defined in NC-SI 1.0.

It is also recommended that the MC use the packet reduction and manageability-to-host commands to set the proper filtering method.

Bytes     Bits 31..24   23..16   15..08   07..00
00..15    NC-SI header
16..19    Manufacturer ID (Intel 0x157)
20        0x06

8.12.2.2.2 Get System MAC Address Response (Intel Command 0x06)

Bytes     Bits 31..24   23..16   15..08   07..00
00..15    NC-SI header
16..19    Response code          Reason code
20..23    Manufacturer ID (Intel 0x157)
24..27    0x06          MAC address
28..30    MAC address
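As an illustration of the response layout above, the following Python sketch extracts the system MAC address from a Get System MAC Address response. It assumes the buffer starts at byte 0 of the NC-SI header, that fields are big-endian, and that a response code of 0x0 means the command completed (as in the NC-SI specification); the function name is hypothetical.

```python
import struct

INTEL_MANUFACTURER_ID = 0x157
GET_SYSTEM_MAC_ADDRESS = 0x06
RESPONSE_COMPLETED = 0x0  # assumption: NC-SI "command completed" response code


def parse_get_system_mac_response(packet):
    """Return the MAC address string from a Get System MAC Address response.

    `packet` starts with the 16-byte NC-SI header. Per the table above,
    bytes 16..19 hold the response and reason codes, 20..23 the
    manufacturer ID, byte 24 the Intel command number, and 25..30 the MAC.
    """
    if len(packet) < 31:
        raise ValueError("response is shorter than 31 bytes")
    response_code, reason_code = struct.unpack_from(">HH", packet, 16)
    if response_code != RESPONSE_COMPLETED:
        raise ValueError(f"command failed, reason code 0x{reason_code:04x}")
    manufacturer_id = int.from_bytes(packet[20:24], "big")
    if manufacturer_id != INTEL_MANUFACTURER_ID:
        raise ValueError("unexpected manufacturer ID")
    if packet[24] != GET_SYSTEM_MAC_ADDRESS:
        raise ValueError("unexpected Intel command number")
    return ":".join(f"{b:02x}" for b in packet[25:31])
```

The MC would then pass the returned address to the Set MAC Address command to configure shared MAC address mode.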
8.12.2.3 Set Intel Management Control Formats

8.12.2.3.1 Set Intel Management Control Command (Intel Command 0x20)

Bytes     Bits 31..24   23..16   15..08                       07..00
00..15    NC-SI header
16..19    Manufacturer ID (Intel 0x157)
20..22    0x20          0x00     Intel Management Control 1

Where Intel Management Control 1 is as follows:

Bit #   Default Value   Description
0       0b              Enable Critical Session Mode (keep PHY link up and veto bit).
                        0b - Disabled
                        1b - Enabled
                        When critical session mode is enabled, the following behaviors are disabled:
                        - The PHY is not reset on PE_RST# and PCIe* resets (in-band and link drop).
                          Other reset events are not affected: Internal_Power_On_Reset, device
                          disable, force TCO, and PHY reset by software.
                        - The PHY does not change its power state. As a result, link speed does
                          not change.
                        - The device does not initiate configuration of the PHY, to avoid losing link.
1-7     0x0             Reserved

8.12.2.3.2 Set Intel Management Control Response (Intel Command 0x20)

Bytes     Bits 31..24   23..16   15..08   07..00
00..15    NC-SI header
16..19    Response code          Reason code
20..23    Manufacturer ID (Intel 0x157)
24..25    0x20          0x00
8.12.2.4 Get Intel Management Control Formats

8.12.2.4.1 Get Intel Management Control Command (Intel Command 0x21)

Bytes     Bits 31..24   23..16   15..08   07..00
00..15    NC-SI header
16..19    Manufacturer ID (Intel 0x157)
20..21    0x21          0x00

8.12.2.4.2 Get Intel Management Control Response (Intel Command 0x21)

Bytes     Bits 31..24   23..16   15..08                       07..00
00..15    NC-SI header
16..19    Response code          Reason code
20..23    Manufacturer ID (Intel 0x157)
24..26    0x21          0x00     Intel Management Control 1

Where Intel Management Control 1 is as described in section 8.12.2.3.1.

8.12.2.5 TCO Reset

This command causes the NC to perform a TCO reset, if Force TCO Reset is enabled in the NVM.

If the MC has detected that the operating system is hung and has blocked the Rx/Tx path, the force TCO reset clears the data path (Rx/Tx) of the NC to enable the MC to transmit and receive packets through the NC.

When this command is issued to a channel in a package, it applies only to that specific channel.

After successfully performing the command, the NC considers the Force TCO command as an indication that the operating system is hung and clears the DRV_LOAD flag (disabling the LAN device driver).

8.12.2.5.1 Perform Intel TCO Reset Command (Intel Command 0x22)

Bytes     Bits 31..24   23..16   15..08   07..00
00..15    NC-SI header
16..19    Manufacturer ID (Intel 0x157)
20        0x22
8.12.2.5.2 Perform Intel TCO Reset Response (Intel Command 0x22)

Bytes     Bits 31..24   23..16   15..08   07..00
00..15    NC-SI header
16..19    Response code          Reason code
20..23    Manufacturer ID (Intel 0x157)
24..26    0x22

8.13 Basic NC-SI Workflows

8.13.1 Package States

An NC package can be in one of the following two states:

1. Selected - In this state, the package is allowed to use the NC-SI lines, meaning the NC package might send data to the MC.
2. De-selected - In this state, the package is not allowed to use the NC-SI lines, meaning the NC package cannot send data to the MC.

Also note that the MC must select no more than one NC package at any given time.

Package selection can be accomplished in one of two methods:

1. Select Package command - This command explicitly selects the NC package.
2. Any other command targeted to a channel in the package also implicitly selects that NC package.

Package de-select can be accomplished only by issuing the De-Select Package command.

Note: The MC should always issue the Select Package command as the first command to the package before issuing channel-specific commands.

For further details on package selection, refer to the NC-SI specification.
8.13.2 Channel States

An NC channel can be in one of the following states:

1. Initial state - In this state, the channel only accepts the Clear Initial State command (the package also accepts the Select Package and De-Select Package commands).
2. Active state - This is the normal operational mode. All commands are accepted.

For normal operation mode, the MC should always send the Clear Initial State command as the first command to the channel.

8.13.3 Discovery

After interface power-up, the MC should perform a discovery process to discover the NCs that are connected to it. This process should include an algorithm similar to the following:

1. For package_id = 0x0 to max_package_id:
   a. Issue a Select Package command to package ID package_id.
   b. If a response was received, then:
      For internal_channel_id = 0x0 to max_internal_channel_id:
         Issue a Clear Initial State command for package_id | internal_channel_id (the combination of package_id and internal_channel_id that creates the channel ID).
         If a response was received, then:
            Consider internal_channel_id a valid channel for the package_id package.
            The MC can now optionally discover channel capabilities and version ID for the channel.
         Else, if a response was not received, retry the Clear Initial State command up to three times.
      Issue a De-Select Package command to the package (and continue to the next package).
   c. Else, if a response was not received, retry the Select Package command up to three times.

8.13.4 Configurations

This section details different configurations that should be performed by the MC.

It is considered a good practice that the MC does not consider any configuration valid unless the MC has explicitly configured it after every reset (entry into the initial state). As a result, it is recommended that the MC re-configure everything at power-up and at channel/package resets.
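The discovery algorithm of section 8.13.3 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `send` is a hypothetical transport callback that returns a response object or `None` on timeout, the channel ID packs the package ID into the upper three bits and the internal channel ID into the lower five bits, and package-level commands use internal channel ID 0x1F (both taken from the NC-SI specification, not from this datasheet).

```python
PACKAGE_LEVEL_CHANNEL = 0x1F  # assumption: internal channel ID for package commands
RETRIES = 3                   # the algorithm above retries a command three times


def make_channel_id(package_id, internal_channel_id):
    """Combine package ID and internal channel ID (assumed 3-bit / 5-bit split)."""
    return ((package_id & 0x7) << 5) | (internal_channel_id & 0x1F)


def send_with_retry(send, command, channel_id, retries=RETRIES):
    """Call send(command, channel_id) until it answers or retries run out."""
    for _ in range(retries):
        response = send(command, channel_id)
        if response is not None:
            return response
    return None


def discover(send, max_package_id, max_internal_channel_id):
    """Return {package_id: [valid internal channel IDs]} for responding packages."""
    found = {}
    for package_id in range(max_package_id + 1):
        package_channel = make_channel_id(package_id, PACKAGE_LEVEL_CHANNEL)
        if send_with_retry(send, "select_package", package_channel) is None:
            continue  # no package with this ID answered
        channels = []
        for internal_id in range(max_internal_channel_id + 1):
            channel_id = make_channel_id(package_id, internal_id)
            if send_with_retry(send, "clear_initial_state", channel_id) is not None:
                channels.append(internal_id)
        send_with_retry(send, "deselect_package", package_channel)
        found[package_id] = channels
    return found
```

Injecting the transport as a callback keeps the loop independent of the actual RMII driver and makes it easy to exercise with a stub that models a single 82574 (one package, one channel).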
8.13.4.1 NC Capabilities Advertisement

NC-SI defines the Get Capabilities command. It is recommended that the MC use this command and verify that the capabilities match its requirements before performing any configurations. For example, the MC should verify that the NC supports a specific AEN before enabling it.

8.13.4.2 Receive Filtering

In order to receive traffic, the MC must configure the NC with receive filtering rules. These rules are checked on every packet received on the LAN interface (such as from the network). Only if the rules match is the packet forwarded to the MC.

8.13.4.2.1 MAC Address Filtering

NC-SI defines three types of MAC address filters: unicast, multicast, and broadcast. To be received (not dropped), a packet must match at least one of these filters.

Note: The MC should set one MAC address using the Set MAC Address command and enable broadcast and global multicast filtering.

Unicast/Exact Match (Set MAC Address Command)

This filter filters on specific 48-bit MAC addresses. The MC must configure this filter with a dedicated MAC address.

Note: The NC might expose three types of unicast/exact match filters (such as MAC filters that match on the entire 48 bits of the MAC address): unicast, multicast, and mixed. The 82574 exposes two mixed filters, which might be used both for unicast and multicast filtering. The MC should use one mixed filter for its MAC address.

Refer to the NC-SI specification - Set MAC Address for further details.

Broadcast (Enable/Disable Broadcast Filter Command)

NC-SI defines a broadcast filtering mechanism which has the following states:

1. Enabled - All broadcast traffic is blocked (not forwarded) to the MC, except for specific filters (such as ARP request, DHCP, and NetBIOS).
2. Disabled - All broadcast traffic is forwarded to the MC, with no exceptions.

Note: Refer to the NC-SI specification Enable/Disable Broadcast Filter command.

Global Multicast (Enable/Disable Global Multicast Filter)

NC-SI defines a multicast filtering mechanism which has the following states:

1. Enabled - All multicast traffic is blocked (not forwarded) to the MC.
2. Disabled - All multicast traffic is forwarded to the MC, with no exceptions.

The recommended operational mode is enabled, with specific filters set.

Note: Not all multicast filtering modes are necessarily supported. Refer to the NC-SI specification Enable/Disable Global Multicast Filter command for further details.
8.13.4.2.2 VLAN

NC-SI defines the following VLAN work modes:

Mode         Command and Name                    Description
Disabled     Disable VLAN command                In this mode, no VLAN frames are received.
Enabled #1   Enable VLAN command with            In this mode, only packets that matched a VLAN filter
             VLAN only                           are forwarded to the MC.
Enabled #2   Enable VLAN command with            In this mode, packets from mode 1 + non-VLAN packets
             VLAN only + non-VLAN                are forwarded.
Enabled #3   Enable VLAN command with            In this mode, packets are forwarded regardless of
             any-VLAN + non-VLAN                 their VLAN state.

Refer to the NC-SI specification - Enable VLAN command for further details.

The 82574 only supports modes #1 and #3.

Recommendation:

1. Modes:
   a. If VLAN is not required - use the disabled mode.
   b. If VLAN is required - use the enabled #1 mode.
2. If enabling VLAN, the MC should also set the active VLAN ID filters using the NC-SI Set VLAN Filter command prior to setting the VLAN mode.

8.13.5 Pass-Through Traffic States

The MC has independent, separate controls for the enablement states of the receive (from LAN) and of the transmit (to LAN) pass-through paths.

8.13.5.1 Channel Enable

This mode controls the state of the receive path:

1. Disabled - The channel does not pass any traffic from the network to the MC.
2. Enabled - The channel passes any traffic from the network (that matched the configured filters) to the MC.

Note: This state also affects AENs: AENs are only sent in the enabled state.

Note: The default state is disabled.

Note: It is recommended that the MC complete all filtering configuration before enabling the channel.
8.13.5.2 Network Transmit Enable

This mode controls the state of the transmit path:

1. Disabled - The channel does not pass any traffic from the MC to the network.
2. Enabled - The channel passes any traffic from the MC (that matched the source MAC address filters) to the network.

Note: The default state is disabled.

Note: The NC filters pass-through packets according to their source MAC address. The NC tries to match that source MAC address to one of the MAC addresses configured by the Set MAC Address command. As a result, the MC should enable network transmit only after configuring the MAC address.

Note: It is recommended that the MC complete all filtering configuration (especially MAC addresses) before enabling the network transmit.

Note: This feature can be used for fail-over scenarios. See section 8.15.3.

8.13.6 Asynchronous Event Notifications

Asynchronous Event Notifications (AENs) are unsolicited messages sent from the NC to the MC to report status changes (such as link change, operating system state change, etc.).

Recommendations:

- The MC firmware designer should use AENs. To do so, the designer must take into account the possibility that an NC-SI response frame (such as a frame with the NC-SI EtherType) arrives out of context (not immediately after a command, but rather after an out-of-context AEN).
- To enable AENs, the MC should first query which AENs are supported using the Get Capabilities command, then enable the desired AEN(s) using the Enable AEN command, and only then enable the channel using the Enable Channel command.

8.13.7 Querying Active Parameters

The MC can use the Get Parameters command to query the current status of the operational parameters.
8.14 Resets

In NC-SI there are two types of resets defined:

1. Synchronous entry into the initial state.
2. Asynchronous entry into the initial state.

Recommendations:

- It is very important that the MC firmware designer keep in mind that, following any type of reset, all configurations are considered lost and thus the MC must re-configure everything.
- As an asynchronous entry into the initial state might not be reported and/or explicitly noticed, the MC should periodically poll the NC with NC-SI commands (such as Get Version ID, Get Parameters, etc.) to verify that the channel is not in the initial state. Should the NC channel respond to the command with a "Clear Initial State command expected" reason code, the MC should consider the channel (and most probably the entire NC package) as having undergone a (possibly unexpected) reset event. Thus, the MC should re-configure the NC. See the NC-SI specification section on detecting pass-through traffic interruption.
- The Intel recommended polling interval is 2-3 seconds.

For exact details on the resets, refer to the NC-SI specification.

8.15 Advanced Workflows

8.15.1 Multi-NC Arbitration

As described in section 8.11.2, in a multi-NC environment, there is a need to arbitrate the NC-SI lines. Figure 53 shows the system topology of such an environment.

Figure 53. Multi-NC Environment (the MC connects over the NC-SI TX lines, the NC-SI RX lines, and the hardware-arbitration lines to NC package 1, with channel 1 at 0x0 and channel 2 at 0x1, and to NC package 2, with channel 1 at 0x0)
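The polling recommendation in section 8.14 can be sketched as a single polling step in Python. The sketch assumes that the probe returns a `(response_code, reason_code)` pair or `None` on timeout, and that the "Clear Initial State command expected" condition is reported as response code 0x1 with reason code 0x1 ("interface initialization required" in the NC-SI specification); both values are assumptions to verify against the specification, and all function names are hypothetical.

```python
RESPONSE_FAILED = 0x1           # "command failed" response code
REASON_INIT_REQUIRED = 0x1      # assumption: NC-SI "initialization required" reason
POLL_INTERVAL_SECONDS = 2.5     # within the 2-3 second interval recommended above


def poll_channel(send_probe, reconfigure):
    """Run one polling step and report what happened.

    send_probe: hypothetical callable that issues a harmless command
        (for example Get Version ID) and returns (response_code,
        reason_code), or None if the NC did not answer.
    reconfigure: hypothetical callable that replays the full MC
        configuration, starting with Clear Initial State.
    """
    response = send_probe()
    if response is None:
        return "no_response"
    response_code, reason_code = response
    if response_code == RESPONSE_FAILED and reason_code == REASON_INIT_REQUIRED:
        # The channel fell back to the initial state: treat it as a reset.
        reconfigure()
        return "reconfigured"
    return "ok"
```

A firmware main loop would call `poll_channel` once per `POLL_INTERVAL_SECONDS` for every configured channel.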
See Figure 53. The NC-SI RX lines are shared between the NCs. To enable sharing of the NC-SI RX lines, NC-SI has defined an arbitration scheme.

The arbitration scheme mandates that only one NC package can use the NC-SI RX lines at any given time. The NC package that is allowed to use these lines is defined as selected. All the other NC packages are de-selected.

NC-SI has defined two mechanisms for the arbitration scheme:

1. Package selection by the MC. In this mechanism, the MC is responsible for arbitrating between the packages by issuing NC-SI commands (Select/De-Select Package). The MC is responsible for having only one package selected at any given time.
2. Hardware arbitration. In this mechanism, two additional pins on each NC package are used to synchronize the NC packages. Each NC package has an ARB_IN and an ARB_OUT line, and these lines are used to transfer tokens. An NC package that has a token is considered selected.

Note: Hardware arbitration is enabled by default after interface power up.

Note: The 82574 does not support hardware arbitration.

For further details, refer to section 4 in the NC-SI specification.

8.15.1.1 Package Selection Sequence Example

Following is an example workflow for an MC; it occurs after discovery, initialization, and configuration.

Assuming the MC needs to share the NC-SI bus between packages, the MC should:

1. Define a time slot for each device.
2. Discover, initialize, and configure all the NC packages and channels.
3. Issue a De-Select Package command to all the channels.
4. Set active_package to 0x0 (or the lowest existing package ID).
5. At the beginning of each time slot, the MC should:
   a. Issue a De-Select Package command to the active_package. The MC must then wait for a response and then an additional timeout for the package to become de-selected (200 µs). See the NC-SI specification, Table 10 - parameter NC Deselect to Hi-Z Interval.
   b. Find the next available package (typically active_package = active_package + 1).
   c. Issue a Select Package command to active_package.

8.15.2 External Link Control

The MC can use the NC-SI Set Link command to control the external interface link settings. This command enables the MC to set the auto-negotiation, link speed, duplex, and other parameters.

This command is only available when the host operating system is not present. The host operating system status can be obtained via the Get Link Status command and/or the Host OS Status Change AEN.
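Step 5 of the package selection sequence in section 8.15.1.1 can be sketched in Python as follows. The `send` callback is a hypothetical blocking transport that returns once the response arrives, the command names are illustrative strings, and the 200 µs figure is the de-select timeout quoted in step 5a.

```python
import time

DESELECT_TO_HIZ_SECONDS = 200e-6  # additional wait after the De-Select response


def next_time_slot(active_package, package_ids, send, sleep=time.sleep):
    """Hand the shared NC-SI RX lines to the next package and return its ID.

    active_package: package ID currently selected.
    package_ids: IDs of all discovered packages (must include active_package).
    send: hypothetical callable send(command, package_id) that blocks
        until the NC has responded.
    sleep: injectable delay function, so the routine can be tested.
    """
    # 5a. De-select the current package, then wait for its drivers to go Hi-Z.
    send("deselect_package", active_package)
    sleep(DESELECT_TO_HIZ_SECONDS)
    # 5b. Find the next available package, wrapping around at the end.
    ordered = sorted(package_ids)
    following = ordered[(ordered.index(active_package) + 1) % len(ordered)]
    # 5c. Select it for the coming time slot.
    send("select_package", following)
    return following
```

Calling this once at the start of every time slot keeps exactly one package selected, which is the invariant the MC-driven arbitration mechanism requires.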
Recommendation:

- Unless explicitly needed, it is not recommended to use this feature. The NC-SI Set Link command does not expose all the possible link settings and/or features. This might cause issues under different scenarios. Even if it is decided to use this feature, it is recommended to use it only if the link is down (trust the 82574 until proven otherwise).
- It is recommended that the MC first query the link status using the Get Link Status command. The MC should then use this data as a basis and change only the needed parameters when issuing the Set Link command.

For further details, refer to the NC-SI specification.

8.15.2.1 Set Link While LAN PCIe Functionality is Disabled

In cases where the 82574 is used solely for manageability and its LAN PCIe function is disabled, using the NC-SI Set Link command while advertising multiple speeds and enabling auto-negotiation results in the lowest possible speed being chosen.

To enable a higher link speed, the MC should not advertise speeds that are below the desired link speed, as the lowest advertised link speed is chosen.

When the 82574 is only used for manageability and the link speed advertisement is configured by the MC, changes in the power state of the LAN device have no effect, and the link speed is not re-negotiated by the LAN device.

8.15.3 Statistics

The MC might use the statistics commands as defined in NC-SI. These counters are meant mostly for debug purposes and are not all supported.

The statistics are divided into three commands:

1. Controller statistics - These are statistics on the primary interface (to the host operating system). See the NC-SI specification for details.
2. NC-SI statistics - These are statistics on the NC-SI control frames (such as commands, responses, AENs, etc.). See the NC-SI specification for details.
3. NC-SI pass-through statistics - These are statistics on the NC-SI pass-through frames. See the NC-SI specification for details.
9.0 Programming Interface

9.1 PCIe Configuration Space

9.1.1 PCIe Compatibility

PCIe is completely compatible with existing deployed PCI software. To achieve this, PCIe hardware implementations conform to the following requirements:

- All devices required to be supported by the deployed PCI software must be enumerable as part of a tree through PCI device enumeration mechanisms.
- Devices must not require any resources (such as address decode ranges and interrupts) beyond those claimed by PCI resources for operation of software-compatible and software-transparent features with respect to existing deployed PCI software.
- Devices in their default operating state must conform to PCI ordering and cache coherency rules from a software viewpoint.
- PCIe devices must conform to the PCI power management specification. PCIe devices must not require any register programming for PCI-compatible power management beyond that available through the PCI power management capability registers. Power management is expected to conform to standard PCI power management using existing PCI bus drivers.

PCIe devices implement all registers required by the PCI specification as well as the power management registers and capability pointers specified by the PCI power management specification. In addition, PCIe defines a PCIe capability pointer to indicate support for PCIe extensions and associated capabilities.

Note: The 82574 is a single function device - the LAN function.

The 82574 contains the following regions of the PCI configuration space:

- Mandatory PCI configuration registers
- Power management capabilities
- MSI capabilities
- MSI-X capabilities
- PCIe extended capabilities
9.1.2 Mandatory PCI Configuration Registers

The PCI configuration registers map is depicted below. See the detailed register descriptions for registers loaded from the NVM at initialization time. Initialization values of the configuration registers are marked in parentheses.

Color notation in Figure 54:

Light blue   Read-only fields.
Dark grey    Not used. Hardwired to zero.

Configuration registers are assigned one of the attributes described in Table 68.

Table 68. R/W Attribute Table

R/W Attribute   Description
RO              Read-only register: Register bits are read-only and cannot be altered by software.
RW              Read-write register: Register bits are read-write and can be either set or reset.
R/W1C           Read-only status, write-1-to-clear status register. Writing a 0b to R/W1C bits has
                no effect.
ROS             Read-only register with sticky bits: Register bits are read-only and cannot be
                altered by software. Bits are not cleared by reset and can only be reset with the
                PWRGOOD signal. Devices that consume AUX power are not allowed to reset sticky bits
                when AUX power consumption (either via AUX power or PME enable) is enabled.
RWS             Read-write register with sticky bits: Register bits are read-write and can be either
                set or reset by software to the desired state. Bits are not cleared by reset and can
                only be reset with the PWRGOOD signal. Devices that consume AUX power are not allowed
                to reset sticky bits when AUX power consumption (either via AUX power or PME enable)
                is enabled.
R/W1CS          Read-only status, write-1-to-clear status register with sticky bits: Register bits
                indicate status when read. A set bit indicating a status event can be cleared by
                writing a 1b. Writing a 0b to R/W1CS bits has no effect. Bits are not cleared by
                reset and can only be reset with the PWRGOOD signal. Devices that consume AUX power
                are not allowed to reset sticky bits when AUX power consumption (either via AUX power
                or PME enable) is enabled.
HwInit          Hardware initialized: Register bits are initialized by firmware or hardware
                mechanisms such as pin strapping or serial NVM. Bits are read-only after
                initialization and can only be reset (for write-once by firmware) with the PWRGOOD
                signal.
RsvdP           Reserved and preserved: Reserved for future R/W implementations; software must
                preserve the value read for writes to bits.
RsvdZ           Reserved and zero: Reserved for future R/W1C implementations; software must use 0b
                for writes to bits.

Byte Offset   Byte 3               Byte 2                      Byte 1                  Byte 0
0x0           Device ID                                        Vendor ID (0x8086)
0x4           Status register (0x0010)                         Command register (0x0000)
0x8           Class code (0x020000)                                                    Revision ID (0x00)
0xC           BIST (0x00)          Header type (0x00 | 0x80)   Latency timer (0x00)    Cache line size (0x10)
0x10          Base address 0
0x14          Base address 1
0x18          Base address 2
0x1C          Base address 3
0x20          Base address 4
0x24          Base address 5
0x28          CardBus CIS pointer (0x00000000)
0x2C          Subsystem ID (0x0000)                            Subsystem vendor ID (0x8086)
0x30          Expansion ROM base address
0x34          Reserved (0x000000)                                                      Cap_Ptr (0xC8)
0x38          Reserved (0x00000000)
0x3C          Max_Latency (0x00)   Min_Grant (0x00)            Interrupt pin (0x01)    Interrupt line (0x00)

Figure 54. PCI-Compatible Configuration Registers

Explanation of the various registers in the 82574 is as follows.

9.1.2.1 Vendor ID (Offset 0x0)

This is a read-only register that has the same value for all PCI functions. It uniquely identifies Intel products. The field default value is 0x8086.

9.1.2.2 Device ID (Offset 0x2)

This is a read-only register. The value is loaded from the NVM. The default value is 0x10D3 for the 82574.

PCI Function   Default Value   NVM Address   Meaning
LAN            0x10D3          0x0D          10/100/1000 Mb/s Ethernet controller, x1 PCIe, copper

9.1.2.3 Command Reg (Offset 0x4)

Read-write register. The layout is as follows. Shaded bits are not used by this implementation and are hardwired to 0b.

Bit(s)   Init Value   Description
0        0b           I/O access enable.
1        0b           Memory access enable.
2        0b           Enable mastering. LAN R/W field.
3        0b           Special cycle monitoring - hardwired to 0b.
4        0b           MWI enable - hardwired to 0b.
5        0b           Palette snoop enable - hardwired to 0b.
6        0b           Parity error response.
7        0b           Wait cycle enable - hardwired to 0b.
8        0b           SERR# enable.
9        0b           Fast back-to-back enable - hardwired to 0b.
10       0b           Interrupt disable. Controls the ability of a PCIe device to generate a legacy
                      interrupt message. When set, the device cannot generate legacy interrupt
                      messages.
15:11    0b           Reserved.
9.1.2.4 Status Register (Offset 0x6)

Shaded fields are not used by this implementation and are hardwired to 0b.

Bits    Initial Value   R/W     Description
2:0     000b                    Reserved.
3       0b              RO      Interrupt status (1).
4       1b              RO      New capabilities. Indicates that a device implements extended
                                capabilities. The 82574 sets this bit, and implements a capabilities
                                list, to indicate that it supports PCI power management, message
                                signaled interrupts, and the PCIe extensions.
5       0b                      66 MHz capable - hardwired to 0b.
6       0b                      Reserved.
7       0b                      Fast back-to-back capable - hardwired to 0b.
8       0b              R/W1C   Data parity reported.
10:9    00b                     DEVSEL timing - hardwired to 0b.
11      0b              R/W1C   Signaled target abort.
12      0b              R/W1C   Received target abort.
13      0b              R/W1C   Received master abort.
14      0b              R/W1C   Signaled system error.
15      0b              R/W1C   Detected parity error.

1. The interrupt status field is a read-only field that indicates that an interrupt message is pending internally to the device.

9.1.2.5 Revision ID (Offset 0x8)

The default revision ID of this device is 0x0. The value of the revision ID is a logic XOR between the default value and the value in NVM word 0x1E.

9.1.2.6 Class Code (Offset 0x9)

The class code is a read-only, hard-coded value that identifies the device functionality.

LAN - 0x020000 - Ethernet adapter

9.1.2.7 Cache Line Size (Offset 0xC)

This field is implemented by PCIe devices as a read-write field for legacy compatibility purposes but has no impact on any PCIe device functionality. Loaded from NVM word 0x1A.

9.1.2.8 Latency Timer (Offset 0xD)

Not used. Hardwired to 0b.

9.1.2.9 Header Type (Offset 0xE)

This indicates if a device is single function or multifunction. For the 82574, this field has a value of 0x00 to indicate a single function device.
9.1.2.10 Base Address Registers (Offset 0x10 - 0x27)

The base address registers (BARs) are used to map the 82574 register space. The 82574 BARs are defined as non-prefetchable, and therefore support 32-bit addressing only.

BAR   Addr.   Address Field                               Low-Order Bits
0     0x10    Memory BAR (R/W - 31:17; 0b - 16:4)         Bit 3 = 0b, bits 2:1 = 00b, bit 0 = 0b
1     0x14    Flash BAR (R/W - 31:23/16; 0b - 22/15:4)    Bit 3 = 0b, bits 2:1 = 00b, bit 0 = 0b
2     0x18    I/O BAR (R/W - 31:5; 0b - 4:1)              Bit 1 = 0b, bit 0 = 1b
3     0x1C    MSI-X BAR (R/W - 31:14; 0b - 13:4)          Bit 3 = 0b, bits 2:1 = 00b, bit 0 = 0b
4     0x20    Reserved (read as all 0b's)
5     0x24    Reserved (read as all 0b's)

Note: Flash size is defined by the NVM.

Note: The default setting of the flash BAR enables software to perform initial programming of an empty (non-valid) flash via the (parallel) flash BAR.

Note: The 82574 requests I/O resources to support pre-boot operation (prior to allocating physical memory base addresses).

All BARs have the following fields:

Field                  Bit(s)   R/W   Initial Value       Description
Mem                    0        R     0b for memory       0b = Memory space.
                                      1b for I/O          1b = I/O space.
Mem type               2:1      R     00b (for 32-bit)    Indicates the address space size.
                                                          00b = 32-bit
                                                          10b = 64-bit
                                                          The 82574 BARs are 32-bit only.
Prefetch mem           3        R     0b                  0b = Non-prefetchable space.
                                                          1b = Prefetchable space.
                                                          The 82574 implements non-prefetchable space
                                                          since it has read side effects.
Memory address space   31:4     R/W   0x0                 Read/write bits and bits hardwired to 0b,
                                                          depending on the memory mapping window sizes:
                                                          LAN memory spaces are 128 KB.
                                                          LAN flash spaces can be 64 KB and up to 4 MB
                                                          in powers of 2.
                                                          MSI-X memory space is 16 KB.
                                                          Flash window size is set by the NVM. The flash
                                                          BAR can also be disabled by the NVM.
I/O address space      31:2     R/W   0x0                 Read/write bits and bits hardwired to 0b,
                                                          depending on the I/O mapping window sizes:
                                                          LAN I/O space is 32 bytes.
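The BAR fields above can be decoded with a few bit operations. The Python sketch below decodes the type bits of a 32-bit BAR value and derives the window size from the value read back after writing all ones, which is the standard PCI sizing procedure; the function names are illustrative only.

```python
def decode_bar(value):
    """Decode the low-order type bits of a 32-bit BAR value."""
    if value & 0x1:
        # Bit 0 = 1b: I/O space, address in bits 31:2
        return {"space": "io", "base": value & 0xFFFFFFFC}
    return {
        "space": "memory",
        "is_64bit": ((value >> 1) & 0x3) == 0b10,  # bits 2:1, 00b = 32-bit
        "prefetchable": bool(value & 0x8),         # bit 3
        "base": value & 0xFFFFFFF0,                # bits 31:4
    }


def bar_size(readback):
    """Return the window size implied by a BAR read back after writing 0xFFFFFFFF.

    Address bits hardwired to 0b stay zero, so the size is the two's
    complement of the address mask. Returns 0 for an unimplemented BAR.
    """
    mask = 0xFFFFFFFC if readback & 0x1 else 0xFFFFFFF0
    return (~(readback & mask) + 1) & 0xFFFFFFFF


# Readback values that follow from the R/W bit ranges in the BAR table:
MEMORY_BAR_READBACK = 0xFFFE0000  # R/W bits 31:17 -> 128 KB
IO_BAR_READBACK = 0xFFFFFFE1      # R/W bits 31:5, bit 0 = 1b -> 32 bytes
MSIX_BAR_READBACK = 0xFFFFC000    # R/W bits 31:14 -> 16 KB
```

With these readback values, `bar_size` yields 128 KB, 32 bytes, and 16 KB, matching the window sizes listed for the LAN memory, LAN I/O, and MSI-X spaces.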
Memory and I/O mapping:

Mapping Window   Mapping Description
Memory BAR 0     The internal registers and memories are accessed as direct memory mapped offsets
                 from the base address register. Software can access byte, word, or Dword.
Flash BAR 1      The external flash can be accessed using direct memory mapped offsets from the
                 flash base address register. Software can access byte, word, or Dword. The flash
                 BAR is enabled by the DISLFB field in NVM word 0x21.
I/O BAR 2        All internal registers, memories, and flash can be accessed using I/O operations.
                 There are two 4-byte registers in the I/O mapping window: Addr Reg and Data Reg.
                 Software can access byte, word, or Dword.
MSI-X BAR 3      The internal registers and memories are accessed as direct memory mapped offsets
                 from the base address register. Software accesses are Dword.

9.1.2.11 CardBus CIS (Offset 0x28)

Not used. Hardwired to 0b.

9.1.2.12 Subsystem ID (Offset 0x2E)

This value can be loaded automatically from the NVM at power up with a default value of 0x0000.

9.1.2.13 Subsystem Vendor ID (Offset 0x2C)

This value can be loaded automatically from NVM address 0x0C at power up or reset. The default value is 0x8086 at power up.

9.1.2.14 Expansion ROM Base Address (Offset 0x30)

This register is used to define the address and size information for boot-time access to the optional flash memory. The BAR size and enablement are set by the NVM.

Field      Bit(s)   Read/Write   Initial Value   Description
En         0        R/W          0b              1b = Enables expansion ROM access.
                                                 0b = Disables expansion ROM access.
Reserved   10:1     R            0x0             Always read as 0b. Writes are ignored.
Address    31:11    R/W          0x0             Read/write bits and bits hardwired to 0b, depending
                                                 on the memory mapping window size as defined in
                                                 word 0x21 in the NVM.
9.1.2.15 Cap_Ptr (Offset 0x34)

The capabilities pointer field (Cap_Ptr) is an 8-bit field that provides an offset in the device's PCI configuration space for the location of the first item in the capabilities linked list. The 82574 implements a capabilities list to indicate that it supports:

- PCI power management
- MSI
- MSI-X
- PCIe extended capabilities

Its value, 0xC8, is the address of the first entry: PCI power management.

Address     Item                    Next Pointer
0xC8-0xCF   PCI power management    0xD0
0xD0-0xDF   MSI                     0xE0
0xA0-0xAB   MSI-X                   0x00
0xE0-0xF3   PCIe capabilities       0xA0

9.1.2.16 Interrupt Line (Offset 0x3C)

Read/write register programmed by software to indicate which of the system interrupt request lines this device's interrupt pin is bound to. See the PCI definition for more details.

9.1.2.17 Interrupt Pin (Offset 0x3D)

Read-only register. The LAN implements legacy interrupt on INTA.

9.1.2.18 Max_Lat/Min_Gnt (Offset 0x3E)

Not used. Hardwired to 0b.

9.1.3 PCI Power Management Registers

All fields are reset on full power up. All of the fields except PME_En and PME_Status are reset on exit from the D3cold state.

See the detailed register descriptions for registers loaded from the NVM at initialization time. Initialization values of the configuration registers are marked in parentheses.

Some fields in this section depend on the power management enable bits in NVM word 0x0A.

Table 69 lists the organization of the PCI power management register block. Light-blue fields are read-only fields.

Table 69. Power Management Register Block

Byte Offset   Byte 3   Byte 2                      Byte 1                    Byte 0
0xC8          Power management capabilities (PMC)  Next pointer (0xD0)       Capability ID (0x01)
0xCC          Data     PMCSR_BSE bridge support    Power management control / status register (PMCSR)
                       extensions
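The capabilities linked list that starts at Cap_Ptr can be walked as sketched below in Python. The sketch operates on a 256-byte snapshot of configuration space; the capability IDs for MSI (0x05), PCIe (0x10), and MSI-X (0x11) are the standard PCI values and are not stated in this section, and `example_config_space` is a hypothetical helper that only fills in the pointers from the address table above.

```python
CAP_PTR_OFFSET = 0x34

# Standard PCI capability IDs (only 0x01 appears in this section).
CAPABILITY_NAMES = {
    0x01: "PCI power management",
    0x05: "MSI",
    0x10: "PCIe",
    0x11: "MSI-X",
}


def walk_capabilities(config_space):
    """Return [(offset, capability_id), ...] in linked-list order."""
    capabilities = []
    visited = set()
    pointer = config_space[CAP_PTR_OFFSET] & 0xFC  # low two bits are reserved
    while pointer and pointer not in visited:      # stop at 0x00 or on a loop
        visited.add(pointer)
        capabilities.append((pointer, config_space[pointer]))
        pointer = config_space[pointer + 1] & 0xFC  # next pointer byte
    return capabilities


def example_config_space():
    """Build a 256-byte image holding only the list pointers documented above."""
    cfg = bytearray(256)
    cfg[CAP_PTR_OFFSET] = 0xC8
    for offset, cap_id, next_ptr in (
        (0xC8, 0x01, 0xD0),  # PCI power management
        (0xD0, 0x05, 0xE0),  # MSI
        (0xE0, 0x10, 0xA0),  # PCIe capabilities
        (0xA0, 0x11, 0x00),  # MSI-X, end of list
    ):
        cfg[offset] = cap_id
        cfg[offset + 1] = next_ptr
    return bytes(cfg)
```

Walking the example image visits 0xC8, 0xD0, 0xE0, and 0xA0 in that order, which is the chain described by the next-pointer column.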
the following section describes the register definitions, whether they are required or optional for compliance, and how they are implemented in the 82574.

9.1.3.1 capability id, offset 0xc8, (ro)
this field equals 0x01 indicating the linked list item is the pci power management registers.

9.1.3.2 next pointer, offset 0xc9, (ro)
this field provides an offset to the next capability item in the capability list. its value of 0xd0 points to the msi capability.

9.1.3.3 power management capabilities (pmc), offset 0xca, (ro)
this field describes the device functionality at the power management states as described in the following table.

figure 55. power management capabilities (pmc)
bits | default | r/w | description
15:11 | see value in description column | ro | pme_support. this five-bit field indicates the power states in which the function might assert pme# depending on nvm settings:
    00000b = if pm is disabled in nvm (word 0x0a), then no pme support at all states.
    01001b = if pm is enabled in nvm and no aux_pwr, then pme is supported at d0 and d3hot.
    11001b = if pm is enabled in nvm and aux_pwr, then pme is supported at d0, d3hot and d3cold.
10 | 0b | ro | d2_support. the 82574 does not support d2 state.
9 | 0b | ro | d1_support. the 82574 does not support d1 state.
8:6 | 000b | ro | aux current. required current defined in the data register.
5 | 1b | ro | dsi. the 82574 requires its software device driver to be executed following transition to the d0 uninitialized state.
4 | 0b | ro | reserved.
3 | 0b | ro | pme_clock. disabled. hardwired to 0b.
2:0 | 010b | ro | version. the 82574 complies with pci pm spec revision 1.1.
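As a reading aid, the PMC bit layout in the table above can be unpacked as follows. This is a minimal sketch, not driver code: `decode_pmc` and `PME_STATES` are hypothetical names, and the sample values 0xC822, 0x4822, and 0x0022 are assembled here from the table defaults for the three PME_Support cases rather than quoted from the datasheet. The positions of D1 and D2 inside PME_Support follow the PCI power management specification; the table above only spells out D0, D3hot, and D3cold.

```python
# Bit 0..4 of the five-bit PME_Support field (PMC bits 11..15).
PME_STATES = ["D0", "D1", "D2", "D3hot", "D3cold"]


def decode_pmc(pmc):
    """Split a 16-bit PMC value into the fields listed in the PMC table."""
    pme_support = (pmc >> 11) & 0x1F
    return {
        "pme_states": [
            name for bit, name in enumerate(PME_STATES) if pme_support & (1 << bit)
        ],
        "d2_support": bool(pmc & (1 << 10)),
        "d1_support": bool(pmc & (1 << 9)),
        "aux_current": (pmc >> 6) & 0x7,
        "dsi": bool(pmc & (1 << 5)),
        "pme_clock": bool(pmc & (1 << 3)),
        "version": pmc & 0x7,
    }


# Values assembled from the table defaults (an assumption of this example):
# PM enabled with aux power, PM enabled without aux power, PM disabled.
for sample in (0xC822, 0x4822, 0x0022):
    print(f"0x{sample:04X}", decode_pmc(sample)["pme_states"])
```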
9.1.3.4 power management control/status register - (pmcsr), offset 0xcc, (rw)
this register is used to control and monitor power management events in the 82574.

figure 56. power management control/status - pmcsr
bits | default | rd/wr | description
15 | 0b at power up | r/w1c | pme_status. this bit is set to 1b when the function detects a wake-up event independent of the state of the pme_en bit. writing a 1b clears this bit.
14:13 | see value in data register | ro | data_scale. this field indicates the scaling factor to be used when interpreting the value of the data register. if pm is enabled in the nvm and the data_select field is set to 0, 3, 4 or 7, then this field equals 01b (indicating 0.1 watt units). otherwise it equals 00b.
12:9 | 0000b | r/w | data_select. this four-bit field is used to select which data is to be reported through the data register and data_scale field. these bits are writeable only when power management is enabled via the nvm.
8 | 0b at power up | r/w | pme_en. if power management is enabled in the nvm, writing a 1b to this bit enables wake up. if power management is disabled in the nvm, writing a 1b to this bit has no effect and does not set the bit to 1b.
7:4 | 000000b | ro | reserved. the 82574 returns a value of 000000b for this field.
3 | 0b | ro | no_soft_reset. this bit is always set to 0b to indicate that the 82574 performs an internal reset upon transitioning from d3hot to d0 via software control of the powerstate bits. configuration context is lost when performing the soft reset. upon transition from the d3hot to the d0 state, a full re-initialization sequence is needed to return the 82574 to the d0 initialized state.
2 | 0b | ro | reserved.
1:0 | 00b | r/w | power state. this field is used to set and report the power state of the 82574 as follows:
    00b = d0.
    01b = d1 (cycle ignored if written with this value).
    10b = d2 (cycle ignored if written with this value).
    11b = d3 (cycle ignored if pm is not enabled in the nvm).

9.1.3.5 pmcsr_bse bridge support extensions, offset 0xce, (ro)
this register is not implemented in the 82574, values set to 0x00.
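Because pme_status is write-1-to-clear, a read-modify-write of the pmcsr that changes only the power state should avoid writing back a 1b in bit 15. The sketch below shows one way to build such a write value under that assumption; `decode_pmcsr` and `pmcsr_with_state` are hypothetical helper names, and no real register access is performed.

```python
PME_STATUS = 1 << 15       # r/w1c: writing 1b clears the bit
PME_EN = 1 << 8
POWER_STATE_MASK = 0x3
D0, D3 = 0b00, 0b11        # the only power state encodings the 82574 accepts


def decode_pmcsr(value):
    """Unpack the PMCSR fields described in the PMCSR table."""
    return {
        "pme_status": bool(value & PME_STATUS),
        "data_scale": (value >> 13) & 0x3,
        "data_select": (value >> 9) & 0xF,
        "pme_en": bool(value & PME_EN),
        "power_state": value & POWER_STATE_MASK,
    }


def pmcsr_with_state(current, state):
    """Return the value to write so that only the power state field changes."""
    if state not in (D0, D3):
        # Writes of 01b (d1) and 10b (d2) are ignored by the device.
        raise ValueError("the 82574 supports only d0 and d3")
    value = current & ~PME_STATUS               # do not clear a pending pme_status
    value = (value & ~POWER_STATE_MASK) | state
    return value & 0xFFFF


# Example: pme_status pending, pme_en set, device currently in d0.
current = 0x8100
print(hex(pmcsr_with_state(current, D3)))
```

Whether a driver wants to preserve or clear pme_status at that point is a policy decision; the helper simply makes the choice explicit.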
9.1.3.6 data register, offset 0xcf, (ro)
this optional register is used to report power consumption and heat dissipation. the reported register is controlled by the data_select field in the pmcsr and the power scale is reported in the data_scale field in the pmcsr. the data of this field is loaded from the nvm if power management is enabled in the nvm. otherwise, it has a default value of 0x00. the values for the 82574 are as follows:

function | d0 (consume/dissipate) | d3 (consume/dissipate)
data select | (0x0/0x4) | (0x3/0x7)
function 0 | eeprom address 0x22 | eeprom address 0x22

for other data_select values the data register output is reserved (0b).

9.1.4 message signaled interrupt (msi) configuration registers
this structure is required for pcie devices. initialization values of the configuration registers are marked in parentheses. light-blue fields represent read-only fields.
note: there are no changes to this structure from the pci 2.2 specification.

figure 57. msi configuration registers
byte offset | byte 3 | byte 2 | byte 1 | byte 0
0xd0 | message control (0x0080) | next pointer (0xe0) | capability id (0x05)
0xd4 | message address
0xd8 | message upper address
0xdc | reserved | message data

9.1.4.1 capability id, offset 0xd0, (ro)
this field equals 0x05 indicating the linked list item as being the msi registers.

9.1.4.2 next pointer, offset 0xd1, (ro)
this field provides an offset to the next capability item in the capability list. its value of 0xe0 points to the pcie capability.
82574 gbe controller?programing interface 256 9.1.4.3 message control offset 0xd2, (r/w) the register fields are listed in the following table. 9.1.4.4 message address low offset 0xd4, (r/w) written by the system to indicate the lower 32 bits of the address to use for the msi memory write transaction. the lower two bits always returns 0b regardless of the write operation. 9.1.4.5 message address high, offset 0xd8, (r/w) written by the system to indicate the upper 32 bits of the address to use for the msi memory write transaction. 9.1.4.6 message data, offset 0xdc, (r/w) written by the system to indicate the lowe r 16 bits of the data written in the msi memory write dword transaction. the upper 16 bits of the transaction are written as 0b. 9.1.5 msi-x configuration the msi-x capability structure is listed in ta b l e 7 0 . the 82574 is permitted to have both an msi and an msi-x capability structure. in contrast to the msi capability structure, which directly contains all of the control/ status information for the function's vect ors, the msi-x capability structure instead points to an msi-x table structure and a msi-x pending bit array (pba) structure, each residing in memory space. each structure is mapped by a bar belongin g to the 82574, located beginning at 0x10 in the configuration space. a bar indicator register (bir) indicates which bar and a qword-aligned offset indicates where the structure begins relative to the base address associated with the bar. the bar is permitte d to be either 32-bit or 64-bit, but must map memory space. the 82574 is permitted to map both structures with the same bar, or to map each structure with a different bar. the msi-x table structure, detailed in section 10.2.10 typically contains multiple entries, each consisting of several fields : message address, message upper address, message data, and vector control. each entry is capable of specifying a unique vector. 
msi message control register fields (section 9.1.4.3):
bits | default | r/w | description
0 | 0b | r/w | msi enable. if set to 1b, msi is enabled. in this case, the 82574 generates msi for interrupt assertion instead of intx signaling.
3:1 | 000b | ro | multiple message capable. the 82574 indicates a single requested message.
6:4 | 000b | ro | multiple message enable. the 82574 returns 000b to indicate that it supports a single message.
7 | 1b | ro | 64-bit capable. a value of 1b indicates that the 82574 is capable of generating 64-bit message addresses.
15:8 | 0x0 | ro | reserved, reads as 0b.

the pending bit array (pba) structure, shown in the same section, contains the function's pending bits, one per table entry, organized as a packed array of bits within qwords.
the last qword is not necessarily fully populated.

table 70. msi-x capability structure
byte offset | byte 3 | byte 2 | byte 1 | byte 0
0xa0 | message control (0x00090) | next pointer (0x00) | capability id (0x11)
0xa4 | table offset | table bir
0xa8 | pba offset | pba bir

9.1.5.1 capability id, offset 0xa0 (ro)
this field equals 0x11 indicating the linked list item as being the msi-x registers.

9.1.5.2 next pointer, offset 0xa1 (ro)
this field provides an offset to the next capability item in the capability list. its value is 0x00 indicating that this is the last capability.

9.1.5.3 message control, offset 0xa2 (r/w)
the register fields are listed in the following table.

table 71. msi-x message control field
field | bits | default | r/w | description
ts | 10:0 | 0x001 (default value is read from the nvm) | ro | table size. system software reads this field to determine the msi-x table size n, which is encoded as n-1. for example, a returned value of 00000001111b indicates a table size of 16.
rsv | 13:11 | 0b | ro | always return 0b on read. write operation has no effect.
fm | 14 | 0b | r/w | function mask. if set to 1b, all of the vectors associated with the function are masked, regardless of their per-vector mask bit states. if set to 0b, each vector's mask bit determines whether the vector is masked or not. setting or clearing the msi-x function mask bit has no effect on the state of the per-vector mask bits.
en | 15 | 0b | r/w | msi-x enable. if set to 1b and the msi enable bit in the msi message control register is 0b, the function is permitted to use msi-x to request service and is prohibited from using its intx# pin. system configuration software sets this bit to enable msi-x. a software device driver is prohibited from writing this bit to mask a function's service request. if 0b, the function is prohibited from using msi-x to request service.
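The n-1 encoding of the table size field is easy to get wrong by one, so a small decoding sketch may help. The function name `decode_msix_message_control` is hypothetical, and the input values are examples chosen here, not register defaults.

```python
def decode_msix_message_control(value):
    """Unpack the 16-bit MSI-X message control word (table 71 layout)."""
    return {
        "table_size": (value & 0x7FF) + 1,        # field holds n-1
        "function_mask": bool(value & (1 << 14)),
        "msix_enable": bool(value & (1 << 15)),
    }


# The example from the table: a table size field of 00000001111b means 16 entries.
print(decode_msix_message_control(0b00000001111))
# Enable and function mask both set, table size field 0 (one entry).
print(decode_msix_message_control(0xC000))
```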
9.1.5.4 table offset, offset 0xa4 (r/w)

table 72. msi-x table offset
field | bits | default | type | description
table offset | 31:3 | 0x000 | ro | used as an offset from the address contained by one of the function's base address registers to point to the base of the msi-x table. the lower three table bir bits are masked off (set to zero) by software to form a 32-bit qword-aligned offset.
table bir | 2:0 | 0x3 | ro | indicates which one of a function's bars, located beginning at 0x10 in configuration space, is used to map the function's msi-x table into memory space. a bir value of three indicates that the table is mapped in bar 3.

9.1.5.5 pba offset, offset 0xa8 (r/w)

table 73. msi-x pba offset
field | bits | default | type | description
pba offset | 31:3 | 0x400 | ro | used as an offset from the address contained by one of the function's bars to point to the base of the msi-x pba. the lower three pba bir bits are masked off (set to zero) by software to form a 32-bit qword-aligned offset.
pba bir | 2:0 | 0x3 | ro | indicates which one of a function's bars, located beginning at 0x10 in configuration space, is used to map the function's msi-x pba into memory space. a bir value of three indicates that the pba is mapped in bar 3.

to request service using a given msi-x table entry, a function performs a dword memory write transaction using the contents of the message data field entry for data, the contents of the message upper address field for the upper 32 bits of address, and the contents of the message address field entry for the lower 32 bits of address. a memory read transaction from the address targeted by the msi-x message produces undefined results.

msi-x table entries and pending bits are each numbered 0 through n-1, where n-1 is indicated by the table size field in the msi-x message control register. for a given arbitrary msi-x table entry k, its starting address can be calculated with the formula:

    entry starting address = table base + k*16

for the associated pending bit k, its address for qword access and bit number within that qword can be calculated with the formulas:

    qword address = pba base + (k div 64)*8
    qword bit# = k mod 64

software that chooses to read pending bit k with dword accesses can use these formulas:

    dword address = pba base + (k div 32)*4
    dword bit# = k mod 32
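The offset/bir split and the entry and pending-bit formulas above translate directly into integer arithmetic. The following is a sketch for checking the arithmetic only; the function names and the sample register value 0x2003 and base addresses are assumptions made for this example.

```python
def split_offset_bir(register_value):
    """Return (qword-aligned offset, bir) from a table offset or pba offset register."""
    bir = register_value & 0x7
    offset = register_value & ~0x7 & 0xFFFFFFFF   # mask off the three bir bits
    return offset, bir


def table_entry_address(table_base, k):
    """entry starting address = table base + k*16"""
    return table_base + k * 16


def pending_bit_qword(pba_base, k):
    """qword address = pba base + (k div 64)*8, bit# = k mod 64"""
    return pba_base + (k // 64) * 8, k % 64


def pending_bit_dword(pba_base, k):
    """dword address = pba base + (k div 32)*4, bit# = k mod 32"""
    return pba_base + (k // 32) * 4, k % 32


# Hypothetical register value: offset 0x2000 with bir 3 in the low bits.
offset, bir = split_offset_bir(0x2003)
print(hex(offset), bir)
print(hex(table_entry_address(0x1000, 2)))
print(pending_bit_qword(0x2000, 70), pending_bit_dword(0x2000, 70))
```

For small k (below 32) the qword and dword forms return the same address and bit number; they only diverge once k crosses a dword boundary.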
259 programing interface?82574 gbe controller 9.1.6 pcie configuration registers pcie provides two mechanisms to support native features: ? pcie defines a pcie capability pointer indicating support for pcie. ? pcie extends the configuration space beyond the 256 bytes available for pci to 4096 bytes. initialization values of the configuratio n registers are marked in parenthesis. 9.1.6.1 pcie capability structure the 82574 implements the pcie capability stru cture for end-point devices as listed in ta b l e 7 4 : table 74. pcie configuration registers 9.1.6.1.1 capability id, offset 0xe0, (ro) this field equals 0x10 indicating the linke d list item as being the pcie capabilities registers. 9.1.6.1.2 next pointer, offset 0xe1, (ro) offset to the next capability item in the ca pability list. a value of 0xa0 points to the msi-x capability. 9.1.6.1.3 pci express cap, offset 0xe2, (ro) the pcie capabilities register identifies pcie device type and associated capabilities. this is a read-only register. byte offset byte 3 byte 2 byte 1 byte 0 0xe0 pcie capability register next pointer capability id 0xe4 device capability 0xe8 device status device control 0xec link capability 0xf0 link status link control bits default r/w description 3:0 0001b ro capability version indicates the pcie capability structure version number 1. 7:4 0000b ro device/port type indicates the type of pcie functions. lan function in the 82574 is a native pcie functions with a value of 0000b. 80b ro slot implemented the 82574 does not implement slot options therefore this field is hardwired to 0b. 13:9 00000b ro interrupt message number the 82574 does not implement multiple ms i per function, therefore this field is hardwired to 0x0. 15:14 00b ro reserved
82574 gbe controller?programing interface 260 9.1.6.1.4 device cap, offset 0xe4, (ro) this register identifies the pcie device spec ific capabilities. it is a read-only register. 9.1.6.1.5 device control, offset 0xe8, (rw) this register controls pcie specific parameters. bits r/w default description 2:0 ro 001b max payload size supported this field indicates the maximum payload that the device can support for tlps. it is loaded from th e nvm pcie init configurat ion 3 word 0x1a (bit 8) with a default value of 256 bytes. 4:3 ro 00b phantom function supported not supported by the 82574. 5ro0b extended tag field supported max supported size of the tag field. the 82574 supports a 5-bit tag field. 8:6 ro 011b end-point l0s acceptable latency this field indicates the acceptable la tency that the 82574 can withstand due to the transition from l0s state to the l0 state. the value is loaded from the nvm pcie init configuration 1 word 0x18. 11:9 ro 110b end-point l1 acceptable latency this field indicates the acceptable la tency that the 82574 can withstand due to the transition from l1 state to the l0 state. the value is loaded from the nvm pcie init configuration 1 word 0x18. 12 ro 0b attention button present hardwired in the 82574 to 0b. 13 ro 0b attention indicator present hardwired in the 82574 to 0b. 14 ro 0b power indicator present hardwired in the 82574 to 0b. 15 ro 1b role based error reporting hardwired in the 82574 to 1b. 17:16 ro 00b reserved, set to 00b 25:18 ro 0x0 slot power limit value used in upstream ports only. hardwired in the 82574 to 0x00. 27:26 ro 00b slot power limit scale used in upstream ports only. hardwired in the 82574 to 0b. 31:28 ro 0000b reserved bits r/w default description 0rw0b correctable error reporting enable enable error report. 1rw0b non-fatal error reporting enable enable error report. 2rw0b fatal error reporting enable enable error report. 3rw0b unsupported request reporting enable enable error report. 
4 rw 1b enable relaxed ordering if this bit is set, the device is permitted to set the relaxed ordering bit in the attribute field of write transactions that do not need strong ordering. for more details, also see register ctrl_ext bit ro_dis.
261 programing interface?82574 gbe controller 9.1.6.1.6 pcie device status, offset 0xea, (ro) this register provides information ab out pcie device specific parameters. . 7:5 rw 000b (128 bytes) max payload size this field sets maximum tlp payload si ze for the device functions. as a receiver, the device must handle tlps as large as the set value. as a transmitter, the device must not ge nerate tlps exceeding the set value. the maximum payload size supported in the device capabilities register indicates permissible values that can be programmed. 8rw0b extended tag field enable not implemented in the 82574. 9rw0b phantom functions enable not implemented in the 82574. 10 ro 0b auxiliary power pm enable when set, enables the device to draw aux power independent of pme aux power. in the 82574, this bit is hardwired to 0b. 11 rw 1b enable no snoop snoop is gated by nonsnoop bits in the gcr register in the csr space. 14:12 rw 010b max read request size this field sets maximum read request si ze for the device as a requester. the default value is 010b (512 bytes). this maximum read request configuratio n value should not be altered on the fly. 15 ro 0b reserved. bits r/w default description bits r/w default description 0rw1c0b correctable detected indicates status of correctable error detection. 1rw1c0b non-fatal error detected indicates status of non-fatal error detection. 2rw1c0b fatal error detected indicates status of fatal error detection. 3rw1c0b unsupported request detected indicates that the 82574 received an unsupported request. 4ro0b aux power detected if aux power is detected, this field is se t to 1b. it is a strapping signal from the periphery. reset on internal power on reset and pcie power good only. 5ro0b transaction pending indicates whether the 82574 has any transactions pending. (transactions include completions for any outstanding non-posted request for all used traffic classes.). 15:6 ro 0x00 reserved
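The device control fields described above pack several enables and two size encodings into one 16-bit word. The sketch below decodes them, assuming the usual PCIe size encoding of 128 bytes shifted left by the field value, which matches the two data points given in the table (000b = 128 bytes, 010b = 512 bytes). The function name is hypothetical, and the sample value 0x2810 is assembled here from the listed defaults rather than quoted from the datasheet.

```python
def decode_device_control(value):
    """Unpack the PCIe device control register fields (offset 0xe8)."""
    return {
        "correctable_error_reporting": bool(value & (1 << 0)),
        "non_fatal_error_reporting": bool(value & (1 << 1)),
        "fatal_error_reporting": bool(value & (1 << 2)),
        "unsupported_request_reporting": bool(value & (1 << 3)),
        "relaxed_ordering": bool(value & (1 << 4)),
        # Size fields encode 128 << n bytes. Encodings above 101b are reserved
        # in the PCIe specification and are not validated in this sketch.
        "max_payload_bytes": 128 << ((value >> 5) & 0x7),
        "no_snoop": bool(value & (1 << 11)),
        "max_read_request_bytes": 128 << ((value >> 12) & 0x7),
    }


# Defaults from the table: relaxed ordering 1b, max payload 000b,
# no snoop 1b, max read request 010b.
fields = decode_device_control(0x2810)
print(fields["max_payload_bytes"], fields["max_read_request_bytes"])
```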
9.1.6.1.7 link cap, offset 0xec, (ro)
this register identifies pcie link-specific capabilities. this is a read-only register.

bits | r/w | default | description
3:0 | ro | 0001b | max link speed. the 82574 indicates a maximum link speed of 2.5 gb/s.
9:4 | ro | 0x01 | max link width. indicates the maximum link width. the 82574 supports x1 lane link. defined encoding: 000001b = x1. all other values are reserved.
11:10 | ro | 11b | active state link pm support. indicates the level of active state power management supported in the 82574. defined encodings are:
    00b = reserved.
    01b = l0s entry supported.
    10b = reserved.
    11b = l0s and l1 supported.
    this field is loaded from the nvm pcie init configuration 3 word 0x1a.
14:12 | ro | 001b (64-128 ns) | l0s exit latency. indicates the exit latency from l0s to l0 state. this field is loaded from the nvm pcie init configuration 1 word 0x18 (two values, for common pcie clock or separate pcie clock).
    000b = less than 64 ns.
    001b = 64 ns to 128 ns.
    010b = 128 ns to 256 ns.
    011b = 256 ns to 512 ns.
    100b = 512 ns to 1 µs.
    101b = 1 µs to 2 µs.
    110b = 2 µs to 4 µs.
    111b = reserved.
    if the 82574 uses a common clock: pcie init config 1 bits [2:0]. if the 82574 uses a separate clock: pcie init config 1 bits [5:3].
17:15 | ro | 110b (32-64 µs) | l1 exit latency. indicates the exit latency from l1 to l0 state. this field is loaded from the nvm pcie init configuration 1 word 0x18.
    000b = less than 1 µs.
    001b = 1 µs to 2 µs.
    010b = 2 µs to 4 µs.
    011b = 4 µs to 8 µs.
    100b = 8 µs to 16 µs.
    101b = 16 µs to 32 µs.
    110b = 32 µs to 64 µs.
    111b = l1 transition not supported.
18 | ro | 0b | reserved.
19 | ro | 0b | surprise down error reporting capable.
20 | ro | 0b | data link layer link active reporting capable.
23:21 | ro | 000b | reserved.
31:24 | hwinit | 0x0 | port number. the pcie port number for the given pcie link. field is set in the link training phase.
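The encodings in the link capabilities table lend themselves to simple lookup tables. The sketch below decodes a 32-bit link capabilities value; the names are hypothetical, and the sample value 0x00031C11 is assembled here from the listed defaults (with port number 0) rather than taken from the datasheet.

```python
ASPM_SUPPORT = ["reserved", "L0s entry supported", "reserved", "L0s and L1 supported"]

L0S_EXIT_LATENCY = [
    "less than 64 ns", "64 ns to 128 ns", "128 ns to 256 ns", "256 ns to 512 ns",
    "512 ns to 1 us", "1 us to 2 us", "2 us to 4 us", "reserved",
]

L1_EXIT_LATENCY = [
    "less than 1 us", "1 us to 2 us", "2 us to 4 us", "4 us to 8 us",
    "8 us to 16 us", "16 us to 32 us", "32 us to 64 us", "L1 transition not supported",
]


def decode_link_cap(value):
    """Unpack the PCIe link capabilities register (offset 0xec)."""
    return {
        "max_link_speed": value & 0xF,              # 0001b = 2.5 Gb/s
        "max_link_width": (value >> 4) & 0x3F,      # 000001b = x1
        "aspm_support": ASPM_SUPPORT[(value >> 10) & 0x3],
        "l0s_exit_latency": L0S_EXIT_LATENCY[(value >> 12) & 0x7],
        "l1_exit_latency": L1_EXIT_LATENCY[(value >> 15) & 0x7],
        "port_number": (value >> 24) & 0xFF,
    }


# Value assembled from the table defaults; the port number is set by hardware.
for name, field in decode_link_cap(0x00031C11).items():
    print(f"{name}: {field}")
```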
9.1.6.1.8 link control, offset 0xf0, (rw)
this register controls pcie link specific parameters.

bits | r/w | default | description
1:0 | rw | 00b | active state link pm control. this field controls the active state pm supported on the link. defined encodings are:
    00b = pm disabled.
    01b = l0s entry supported.
    10b = reserved.
    11b = l0s and l1 supported.
2 | ro | 0b | reserved.
3 | rw | 0b | read completion boundary.
4 | ro | 0b | link disable. not applicable for end-point devices, hardwired to 0b.
5 | ro | 0b | retrain clock. not applicable for end-point devices, hardwired to 0b.
6 | rw | 0b | common clock configuration. when set, indicates that the 82574 and the component at the other end of the link are operating with a common reference clock. a value of 0b indicates that they operate with an asynchronous clock. this parameter affects the l0s exit latencies.
7 | rw | 0b | extended sync. this bit, when set, forces extended tx of fts ordered set in fts and extra ts1 at exit from l0s prior to enter l0.
15:8 | ro | 0x0 | reserved.

9.1.6.1.9 link status, offset 0xf2, (ro)
this register provides information about pcie link-specific parameters. this is a read-only register.

bits | r/w | default | description
3:0 | ro | 0001b | link speed. indicates the negotiated link speed. 0001b is the only defined speed, which is 2.5 gb/s.
9:4 | ro | 000001b | negotiated link width. indicates the negotiated width of the link. relevant encoding for the 82574 is: 000001b = x1.
10 | ro | 0b | link training error. indicates that a link training error has occurred.
11 | ro | 0b | link training. indicates that link training is in progress.
12 | hwinit | 1b | slot clock configuration. when set, indicates that the 82574 uses the physical reference clock that the platform provides on the connector. this bit must be cleared if the 82574 uses an independent clock. slot clock configuration bit is loaded from the slot_clock_cfg nvm bit.
15:13 | ro | 000b | reserved.
9.1.6.2 pcie extended configuration space
pcie configuration space is located in a flat memory-mapped address space. pcie extends the configuration space beyond the 256 bytes available for pci to 4096 bytes. the 82574 decodes an additional 4 bits (bits 27:24) to provide the additional configuration space as shown. pcie reserves the remaining 4 bits (bits 31:28) for future expansion of the configuration space beyond 4096 bytes.
the configuration address for a pcie device is computed using pci-compatible bus, device and function numbers as follows:

bits 31:28 | bits 27:20 | bits 19:15 | bits 14:12 | bits 11:2 | bits 1:0
0000b | bus # | device # | fun # | register address (offset) | 00b

pcie extended configuration space is allocated using a linked list of optional or required pcie extended capabilities following a format resembling pci capability structures. the first pcie extended capability is located at offset 0x100 in the device configuration space. the first dword of the capability structure identifies the capability/version and points to the next capability.
the 82574 supports the following pcie extended capabilities:
• advanced error reporting capability - offset 0x100
• device serial number capability - offset 0x140

9.1.6.2.1 advanced error reporting capability
the pcie advanced error reporting capability is an optional extended capability to support advanced error reporting. the following table lists the pcie advanced error reporting extended capability structure for pcie devices.

register offset | field | description
0x00 | pcie cap id | pcie extended capability id.
0x04 | uncorrectable error status | reports error status of individual uncorrectable error sources on a pcie device.
0x08 | uncorrectable error mask | controls reporting of individual uncorrectable errors by device to the host bridge via a pcie error message.
0x0c | uncorrectable error severity | controls whether an individual uncorrectable error is reported as a fatal error.
0x10 | correctable error status | reports error status of individual correctable error sources on a pcie device.
0x14 | correctable error mask | controls reporting of individual correctable errors by device to the host bridge via a pcie error message.
0x18 | first error pointer | identifies the bit position of the first uncorrectable error reported in the uncorrectable error status register.
0x1c:0x28 | header log | captures the header for the transaction that generated an error.
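The configuration address layout given in section 9.1.6.2 (bus in bits 27:20, device in bits 19:15, function in bits 14:12, dword-aligned register offset in bits 11:2) can be composed as shown in this sketch. The function name and the bus/device/function numbers used in the examples are hypothetical; only the two extended capability offsets, 0x100 and 0x140, come from the text.

```python
def pcie_config_address(bus, device, function, register):
    """Compose the flat configuration address for a bus/device/function/register."""
    if not 0 <= bus < 256:
        raise ValueError("bus number must fit in 8 bits")
    if not 0 <= device < 32:
        raise ValueError("device number must fit in 5 bits")
    if not 0 <= function < 8:
        raise ValueError("function number must fit in 3 bits")
    if not 0 <= register < 4096 or register & 0x3:
        raise ValueError("register offset must be dword-aligned and below 4096")
    return (bus << 20) | (device << 15) | (function << 12) | register


# Advanced error reporting capability (offset 0x100) on a hypothetical bus 1, device 0.
print(hex(pcie_config_address(1, 0, 0, 0x100)))
# Device serial number capability (offset 0x140) on a hypothetical bus 2, device 3, function 1.
print(hex(pcie_config_address(2, 3, 1, 0x140)))
```

This yields an offset within the configuration region; how that region is located in system memory is platform-specific and outside the scope of this sketch.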
9.1.6.2.1.1 pci express cap id, offset 0x00

bit location | attribute | default value | description
15:0 | ro | 0x0001 | extended capability id. pcie extended capability id indicating advanced error reporting capability.
19:16 | ro | 0x1 | version number. pcie advanced error reporting extended capability version number.
31:20 | ro | 0x000/0x140 | next capability pointer. next pcie extended capability pointer. if serial number capability is enabled in nvm (pcie init configuration 2 word), the default value is 0x140. otherwise, it is 0x000, indicating the end of the capabilities list.

9.1.6.2.1.2 uncorrectable error status, offset 0x04
the uncorrectable error status register reports error status of individual uncorrectable error sources on a pcie device. a value of 1b at a specific bit location indicates the source of the error according to the following table. software might clear an error status by writing a 1b to the respective bit.

bit location | attribute | default value | description
3:0 | ro | 0b | reserved.
4 | r/w1cs | 0b | data link protocol error status.
11:5 | ro | 0b | reserved.
12 | r/w1cs | 0b | poisoned tlp status.
13 | r/w1cs | 0b | flow control protocol error status.
14 | r/w1cs | 0b | completion timeout status.
15 | r/w1cs | 0b | completion abort status.
16 | r/w1cs | 0b | unexpected completion status.
17 | r/w1cs | 0b | receiver overflow status.
18 | r/w1cs | 0b | malformed tlp status.
19 | ro | 0b | reserved.
20 | r/w1cs | 0b | unsupported request error status.
31:21 | ro | 0b | reserved.
82574 gbe controller?programing interface 266 9.1.6.2.1.3 uncorrectable error mask, offset 0x08 the uncorrectable error mask register contro ls reporting of individual uncorrectable errors by device to the host bridge via a pcie error message. a masked error (respective bit set in mask register) is not re ported to the host bridge by an individual device. there is a mask bit per bit of the uncorrectable error status register. 9.1.6.2.1.4 uncorrectable error severity, offset 0x0c the uncorrectable error severity register co ntrols whether an individual uncorrectable error is reported as a fatal error. an uncorre ctable error is reported as fatal when the corresponding error bit in the severity regi ster is set. if the bit is cleared, the corresponding error is considered non-fatal. bit location attribute default value description 3:0 ro 0b reserved. 4 rws 0b data link protocol error mask. 11:5 ro 0b reserved. 12 rws 0b poisoned tlp mask. 13 rws 0b flow control protocol error mask. 14 rws 0b completion timeout mask. 15 rws 0b completion abort mask. 16 rws 0b unexpected completion mask. 17 rws 0b receiver overflow mask. 18 rws 0b malformed tlp mask. 19 ro 0b reserved. 20 rws 0b unsupported request error mask. 31:21 ro 0b reserved. bit location attribute default value description 3:0 ro 0b reserved. 4 rws 1b data link protocol error severity. 11:5 ro 0b reserved. 12 rws 0b poisoned tlp severity. 13 rws 1b flow control protocol error severity. 14 rws 0b completion timeout severity. 15 rws 0b completion abort severity. 16 rws 0b unexpected completion severity. 17 rws 1b receiver overflow severity. 18 rws 1b malformed tlp severity. 19 ro 0b reserved. 20 rws 0b unsupported request error severity. 31:21 ro 0b reserved.
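The status, mask, and severity registers share one bit layout, so software typically combines them bit by bit: a set status bit is reported unless its mask bit is set, and it is treated as fatal when its severity bit is set. The following sketch illustrates that combination; the names are hypothetical, and `DEFAULT_SEVERITY` is assembled here from the default column of the severity table (bits 4, 13, 17, and 18 set).

```python
UNCORRECTABLE_BITS = {
    4: "data link protocol error",
    12: "poisoned tlp",
    13: "flow control protocol error",
    14: "completion timeout",
    15: "completion abort",
    16: "unexpected completion",
    17: "receiver overflow",
    18: "malformed tlp",
    20: "unsupported request error",
}

# Bits whose severity default is 1b in the severity table.
DEFAULT_SEVERITY = sum(1 << bit for bit in (4, 13, 17, 18))


def classify_uncorrectable(status, mask=0, severity=DEFAULT_SEVERITY):
    """Return (name, 'fatal' | 'non-fatal') for each set, unmasked status bit."""
    report = []
    for bit, name in sorted(UNCORRECTABLE_BITS.items()):
        flag = 1 << bit
        if status & flag and not mask & flag:
            report.append((name, "fatal" if severity & flag else "non-fatal"))
    return report


# Example status with poisoned tlp (bit 12) and malformed tlp (bit 18) set.
print(classify_uncorrectable((1 << 12) | (1 << 18)))
```

A masked error is omitted from the returned list because it is not reported to the host bridge; its status bit can still be set in the status register.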
267 programing interface?82574 gbe controller 9.1.6.2.1.5 correctable error status, offset 0x10 the correctable error status register reports error status of individual correctable error sources on a pcie device. when an individual error status bit is set to 1b it indicates that a particular error occurred. software might clear an error status by writing a 1b to the respective bit. 9.1.6.2.1.6 correctable error mask, offset 0x14 the correctable error mask register controls reporting of individual correctable errors by device to the host bridge via a pcie error message. a masked error (respective bit set in mask register) is not reported to the ho st bridge by an individual device. there is a mask bit per bit in the correctable error status register. 9.1.6.2.1.7 first error pointer, offset 0x18 the first error pointer is a read-only register that identifies the bit position of the first uncorrectable error reported in the uncorrectable error status register. bit location attribute default value description 0 r/w1cs 0b receiver error status. 5:1 ro 0b reserved. 6 r/w1cs 0b bad tlp status. 7 r/w1cs 0b bad dllp status. 8 r/w1cs 0b replay_num rollover status. 11:9 ro 0b reserved. 12 r/w1cs 0b replay timer timeout status. 13 r/w1cs 0b advisory non fatal error status. 15:14 ro 0b reserved. bit location attribute default value description 0 rws 0b receiver error mask. 5:1 ro 0b reserved. 6rws0b bad tlp mask. 7 rws 0b bad dllp mask. 8 rws 0b replay_num rollover mask. 11:9 ro 0b reserved. 12 rws 0b replay timer timeout mask. 13 rws 1b advisory non fatal error mask. 15:14 ro 0b reserved. bit location attribute default value description 3:0 ro 0b vector pointing to the first recorded error in the uncorrectable error status register.
82574 gbe controller?programing interface 268 9.1.6.2.1.8 header log, offset 0x1c the header log register captures the header for the transaction that generated an error. this register is 16 bytes. 9.1.6.2.2 device serial number capability the pcie device serial number capability is an optional extended capability that can be implemented by any pcie device. the device serial number is a read-only 64-bit value that is unique for a given pcie device. all multi-function devices that implement this capability must implement it for function 0; other functions that implement this capa bility must return the same device serial number value as that reported by function 0. the 82574 is not a multi-function device. table 75. pcie device serial number capability structure 9.1.6.2.2.1 device serial number enhanced capability header (offset 0x00) figure 58 details the allocation of register fiel ds in the device se rial number enhanced capability header. the table below provides the respective bit definitions. the extended capability id for the device serial number capability is 0003h. figure 58. allocation of register fields in the device serial number enhanced capability header bit location attribute default value description 127:0 ro 0x0 header of the defective packet (tlp or dllp). 31 0 pcie enhanced capability header serial number register (lower dw) serial number register (upper dw) 31 20 19 16 15 0 next capability offset capability version pci express extended capability id bit(s) location attributes description 15:0 ro pcie extended capability id this field is a pci-sig defined id number that indicates the nature and format of the extended capability. extended capability id for the device serial number capability is 0x0003. 19:16 ro capability version this field is a pci-sig defined version number that indicates the version of the capability structure present. must be 0x1 for this version of the specification. 
31:20 ro next capability offset this field contains the offset to the next pcie capability structure or 0x000 if no other items exist in the linked list of capabilities. for extended capabilities impl emented in device configuration space, this offset is relative to the beginning of pci compat ible configuration space and thus must always be either 0x000 (for terminating lis t of capabilities) or greater than 0x0ff.
9.1.6.2.2.2 serial number register (offset 0x04)
the serial number register is a 64-bit field that contains the ieee defined 64-bit extended unique identifier (eui-64). figure 59 details the allocation of register fields in the serial number register. the following table lists the respective bit definitions.

figure 59. serial number register
bits 31:0 | serial number register (lower dw)
bits 63:32 | serial number register (upper dw)

bit(s) location | attributes | description
63:0 | ro | pcie device serial number. this field contains the ieee defined 64-bit extended unique identifier (eui-64). this identifier includes a 24-bit company id value assigned by ieee registration authority and a 40-bit extension identifier assigned by the manufacturer.

9.1.6.2.2.3 serial number definition in the 82574
the serial number can be constructed from the 48-bit mac address in the following form:

figure 60. serial number definition in the 82574
field | company id | mac label | extension identifier
order | addr+0 addr+1 addr+2 | addr+3 addr+4 | addr+5 addr+6 addr+7
addr+0 is the most significant byte and addr+7 is the least significant byte.

the mac label in the 82574 is 0xffff. for example, assume that the company id is (intel) 00-a0-c9 and the extension identifier is 23-45-67. in this case, the 64-bit serial number is:

field | company id | mac label | extension identifier
order | addr+0 addr+1 addr+2 | addr+3 addr+4 | addr+5 addr+6 addr+7
value | 00 a0 c9 | ff ff | 23 45 67

the mac address is the function 0 mac address as loaded from nvm into the ral and rah registers.
the official document defining eui-64 is: http://standards.ieee.org/regauth/oui/tutorials/eui64.html
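The construction in section 9.1.6.2.2.3 amounts to inserting the two-byte mac label 0xffff between the company id and the extension identifier. A minimal sketch, using the example values from the text (the function name is hypothetical):

```python
def serial_number_from_mac(mac):
    """Build the 8-byte serial number (addr+0 .. addr+7) from a 48-bit MAC address."""
    octets = bytes(int(part, 16) for part in mac.split("-"))
    if len(octets) != 6:
        raise ValueError("expected six octets, for example 00-a0-c9-23-45-67")
    company_id, extension = octets[:3], octets[3:]
    return company_id + b"\xff\xff" + extension


serial = serial_number_from_mac("00-a0-c9-23-45-67")
print("-".join(f"{byte:02x}" for byte in serial))  # 00-a0-c9-ff-ff-23-45-67
```

The result is in addr+0 to addr+7 order (most significant byte first); how those bytes land in the lower and upper dwords of the register is not addressed by this sketch.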
10.0 driver programming interface

10.1 introduction

this chapter details the programmer visible state inside the 82574. in some cases, it describes hardware structures invisible to software in order to clarify a concept.

the 82574's address space is mapped into four regions. these regions are listed in table 76:

table 76. 82574 address space

addressable content | how mapped | size of region
internal registers and memories | direct memory-mapped | 128 kb
flash (optional) | direct memory-mapped | 64 kb-16 mb
expansion rom (optional) | direct memory-mapped | 2 kb-256 kb
internal registers and memories, flash (optional) | i/o window mapped | 32 bytes
msi-x (optional) | direct memory-mapped | 16 kb

both the flash and expansion rom base address registers (bars) map the same flash memory. the internal registers, memories, and flash can be accessed through i/o space indirectly, as explained in the sections that follow.

10.1.1 memory and i/o address decoding

10.1.1.1 memory-mapped access to internal registers and memories

the internal registers and memories can be accessed as direct memory-mapped offsets from the base address register 0 (bar0). the appropriate offset for each specific internal register is described in this section.

10.1.1.2 memory-mapped access to flash

the external flash can be accessed using direct memory-mapped offsets from the flash base address register 1 (bar1). the flash is only accessible if enabled through the nvm initialization control word, and if the flash bar1 contains a valid (non-zero) base memory address. for accesses, the offset from the flash bar1 corresponds to the offset into the flash actual physical memory space.

10.1.1.3 memory-mapped access to msi-x tables

the msi-x tables can be accessed as direct memory-mapped offsets from the base address register 3 (bar3). the appropriate offset for each specific internal register is described in this section.
10.1.1.4 memory-mapped access to expansion rom

the external flash can also be accessed as a memory-mapped expansion rom. accesses to offsets starting from the expansion rom bar reference the flash, provided that access is enabled through the nvm initialization control word, and the expansion rom bar contains a valid (non-zero) base memory address.

10.1.1.5 i/o-mapped access to internal registers, memories, and flash

to support pre-boot operation (prior to the allocation of physical memory base addresses), all internal registers, memories, and flash can be accessed using i/o operations. i/o accesses are supported only if:
? an i/o base address register (bar) is allocated and mapped (bar2)
? the bar contains a valid (non-zero) value
? i/o address decoding is enabled in the pcie configuration

when an i/o bar is mapped, the i/o address range allocated opens a 32-byte window in the system i/o address map. within this window, two i/o addressable registers are implemented:
? ioaddr
? iodata

the ioaddr register is used to specify a reference to an internal register, memory, or flash, and then the iodata register is used as a window to the register, memory or flash address specified by ioaddr:

offset | abbreviation | name | r/w | size
0x00 | ioaddr | internal register, internal memory, or flash location address. 0x00000-0x1ffff: internal registers and memories. 0x20000-0x7ffff: undefined. 0x80000-0xfffff: flash. | r/w | 4 bytes
0x04 | iodata | data field for reads or writes to the internal register, internal memory, or flash location as identified by the current value in ioaddr. all 32 bits of this register are read/write-able. | r/w | 4 bytes
0x08-0x1f | reserved | reserved | ro | 4 bytes

10.1.1.5.1 ioaddr (i/o offset 0x00)

the ioaddr register must always be written as a dword access. writes that are less than 32 bits are ignored. reads of any size return a dword of data. however, the chipset or cpu might only return a subset of that dword.

for software programmers, the in and out instructions must be used to cause i/o cycles to be used on the pcie bus. because writes must be to a 32-bit quantity, the source register of the out instruction must be eax (the only 32-bit register supported by the out command). for reads, the in instruction can have any size target register, but it is recommended that the 32-bit eax register be used.

because only a particular range is addressable, the upper bits of this register are hard coded to zero. bits 31 through 20 cannot be written to and always read back as 0b.

at hardware reset (internal power on reset) or pci reset, this register value resets to 0x00000000. once written, the value is retained until the next write or reset.
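the ioaddr ranges and the hard-coded upper bits described above can be modeled with a few lines of c. this is only a sketch for checking values before they are written through the i/o window: the names IOADDR_VALID_MASK, io_target, ioaddr_readback, ioaddr_target and io_window_port are hypothetical and do not come from the datasheet, and no real port i/o is performed.

```c
#include <assert.h>
#include <stdint.h>

#define IOADDR_OFFSET     0x00u        /* i/o offset of ioaddr */
#define IODATA_OFFSET     0x04u        /* i/o offset of iodata */
#define IOADDR_VALID_MASK 0x000fffffu  /* bits 31:20 always read back as 0b */

enum io_target { IO_TARGET_CSR, IO_TARGET_UNDEFINED, IO_TARGET_FLASH };

/* value that ioaddr holds after software writes 'written' to it */
static uint32_t ioaddr_readback(uint32_t written)
{
    return written & IOADDR_VALID_MASK;
}

/* classify an ioaddr value according to the ranges in the table above */
static enum io_target ioaddr_target(uint32_t written)
{
    uint32_t addr = ioaddr_readback(written);

    if (addr <= 0x1ffffu)
        return IO_TARGET_CSR;        /* internal registers and memories */
    if (addr <= 0x7ffffu)
        return IO_TARGET_UNDEFINED;  /* must not be accessed via iodata */
    return IO_TARGET_FLASH;
}

/* port number of ioaddr or iodata inside the 32-byte window at io_base */
static uint16_t io_window_port(uint16_t io_base, uint32_t offset)
{
    assert(offset == IOADDR_OFFSET || offset == IODATA_OFFSET);
    return (uint16_t)(io_base + offset);
}
```

a driver could call ioaddr_target() before each indirect access and refuse to touch iodata when the result is IO_TARGET_UNDEFINED.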
10.1.1.5.2 iodata (i/o offset 0x04)

the iodata register must always be written as a dword access when the ioaddr register contains a value for the internal register and memories (such as, 0x00000-0x1fffc). in this case, writes that are less than 32 bits are ignored.

the iodata register may be written as a byte, word, or dword access when the ioaddr register contains a value for the flash (such as, 0x80000-0xfffff). in this case, the value in ioaddr must be properly aligned to the data value. the following table lists the supported configurations:

access type | 82574 ioaddr register bits [1:0] | target iodata access be[3:0]# bits in data phase
byte (8 bit) | 00b | 1110b
byte (8 bit) | 01b | 1101b
byte (8 bit) | 10b | 1011b
byte (8 bit) | 11b | 0111b
word (16 bit) | 00b | 1100b
word (16 bit) | 10b | 0011b
dword (32 bit) | 00b | 0000b

note: software might have to implement non-obvious code to access the flash, a byte, or word at a time. example code that reads a flash byte is shown here to illustrate the impact of the previous table:

    /* illustrative only: the dereferences stand for i/o cycles (out/in instructions),
       and iobase is the i/o base address taken from bar2. */
    volatile uint32_t *ioaddr = (volatile uint32_t *)(iobase + 0);  /* ioaddr: dword writes only */
    volatile uint8_t *iodata = (volatile uint8_t *)(iobase + 4);    /* iodata: byte access to flash */
    *ioaddr = flash_byte_address;                        /* select the flash byte */
    read_data = *(iodata + (flash_byte_address % 4));    /* byte lane follows ioaddr bits [1:0] */

reads to iodata of any size return a dword of data. however, the chipset or cpu might only return a subset of that dword.

for software programmers, the in and out instructions must be used to cause i/o cycles to be used on the pcie bus. where 32-bit quantities are required on writes, the source register of the out instruction must be eax (the only 32-bit register supported by the out command).

writes and reads to iodata when the ioaddr register value is in an undefined range (0x20000-0x7fffc) should not be performed. results cannot be determined.

note: there are no special software timing requirements on accesses to ioaddr or iodata. all accesses are immediate except when data is not readily available or acceptable. in this case, the 82574 delays the results through normal bus methods (for example, split transaction or transaction retry).

note: because a register/memory/flash read or write takes two i/o cycles to complete, software must provide a guarantee that the two i/o cycles occur as an atomic operation. otherwise, results can be non-deterministic from the software viewpoint.
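the byte-enable table for flash accesses through iodata follows a regular pattern: the enabled lanes start at ioaddr bits [1:0] and cover the access size, with be[3:0]# being active low. the sketch below reproduces the table under that reading. the function name iodata_byte_enables and the -1 return value for unsupported combinations are assumptions made for illustration.

```c
#include <stdint.h>

/* hypothetical helper: return the be[3:0]# pattern (active low) for an
 * iodata access of size_bytes (1, 2 or 4) when ioaddr holds 'ioaddr',
 * or -1 when the combination is not listed in the table. */
static int iodata_byte_enables(unsigned size_bytes, uint32_t ioaddr)
{
    unsigned lane = ioaddr & 0x3u;  /* ioaddr bits [1:0] */

    switch (size_bytes) {
    case 1:                         /* byte: any lane */
        return (int)(~(0x1u << lane) & 0xfu);
    case 2:                         /* word: lanes 0 and 2 only */
        if (lane & 0x1u)
            return -1;
        return (int)(~(0x3u << lane) & 0xfu);
    case 4:                         /* dword: lane 0 only */
        return (lane == 0u) ? 0x0 : -1;
    default:
        return -1;
    }
}
```

for example, a byte access with ioaddr bits [1:0] = 10b gives 1011b (0xb), and a word access with bits [1:0] = 10b gives 0011b (0x3), matching the table.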
10.1.1.5.3 undefined i/o offsets

i/o offsets 0x08 through 0x1f are considered to be reserved offsets with the i/o window. dword reads from these addresses return 0xffff; writes to these addresses are discarded.

10.1.2 registers byte ordering

this section defines the structure of registers that contain fields carried over the network. some examples are l2, l3, l4 fields.

the following example is used to describe byte ordering over the wire (hex notation):

last                      first
..., 06 05 04 03 02 01 00

where each byte is sent with the least significant bit (lsb) first. that is, the bit order over the wire for this example is

last                                          first
.... 0000 0011 0000 0010 0000 0001 0000 0000

the general rule for register ordering is to use host ordering. using the previous example, a 6-byte field (such as, mac address) is stored in a csr in the following manner:

 | byte 3 | byte 2 | byte 1 | byte 0
dw address (n) | 0x03 | 0x02 | 0x01 | 0x00
dw address (n+4) | | | 0x05 | 0x04

the following exceptions use network ordering. using the previous example, a 16-bit field (such as, ethertype) is stored in a csr in the following manner:

 | byte 3 | byte 2 | byte 1 | byte 0
(dw aligned) | ... | ... | 0x01 | 0x00
or (word aligned) | 0x00 | 0x01 | ... | ...

the following exception uses network ordering:
? all ethertype fields

note: the normal notation as it appears in text books, etc. is to use network ordering. example: suppose a mac address of 00-a0-c9-00-00-00. the order on the network is 00, then a0, then c9, etc. however, the host ordering presentation is:

 | byte 3 | byte 2 | byte 1 | byte 0
dword address (n) | 00 | c9 | a0 | 00
dword address (n+4) | ... | ... | 00 | 00
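the host-ordering rule for a 6-byte field can be written out as a small c helper. this is a sketch under stated assumptions: the type host_ordered_mac and the function mac_to_host_order are hypothetical names, the mac address is passed in the order it appears on the network, and only the low 16 bits of the second dword are modeled (any other bits a real register keeps there are outside this example).

```c
#include <stdint.h>

/* two dwords holding a 6-byte field in host ordering */
struct host_ordered_mac {
    uint32_t dw_n;   /* dword address (n):   bytes 3..0 of the field */
    uint32_t dw_n4;  /* dword address (n+4): bytes 5..4 in bits 15:0 */
};

/* mac[0] is the first byte on the network, mac[5] the last */
static struct host_ordered_mac mac_to_host_order(const uint8_t mac[6])
{
    struct host_ordered_mac out;

    out.dw_n = (uint32_t)mac[0]
             | ((uint32_t)mac[1] << 8)
             | ((uint32_t)mac[2] << 16)
             | ((uint32_t)mac[3] << 24);
    out.dw_n4 = (uint32_t)mac[4]
              | ((uint32_t)mac[5] << 8);
    return out;
}
```

with the mac address 00-a0-c9-00-00-00 from the note above, dw_n is 0x00c9a000 and dw_n4 is 0x00000000, which matches the host ordering presentation.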
10.1.3 register conventions

all registers in the 82574 are defined to be 32 bits. they should be accessed as 32-bit double-words. there are some exceptions to this rule:
? register pairs where two 32-bit registers make up a larger logical size.
? accesses to flash memory (via expansion rom space, secondary bar space, or the i/o space) can be byte, word or double word accesses.

reserved bit positions: some registers contain certain bits that are marked as reserved. reads from registers containing reserved bits might return indeterminate values in the reserved bit-positions unless read values are explicitly stated. when read, these reserved bits should be ignored by software.

reserved and/or undefined addresses: any register address not explicitly declared in this specification should be considered to be reserved, and should not be written to.

note: writing to reserved or undefined register addresses can cause indeterminate behavior. reads from reserved or undefined configuration register addresses might return indeterminate values unless read values are explicitly stated for specific addresses.

initial values: most registers define the initial hardware values prior to being programmed. in some cases, hardware initial values are undefined and are listed as such via the text undefined, unknown, or x. some of these configuration values should be set via nvm configuration or via software in order to ensure proper operation. this need is dependent on the function of the bit. other registers might cite a hardware default which is overridden by a higher-precedence operation. operations that might supersede hardware defaults can include:
? a valid nvm load
? completion of a hardware operation (such as hardware auto-negotiation)
? writing of a different register whose value is then reflected in another bit

for registers that should be accessed as 32-bit double words, partial writes (less than a 32-bit double word) do not take effect (that is, the write is ignored). partial reads return all 32 bits of data regardless of the byte enables.

note: partial reads to clear-by-read registers (such as, icr) can have unexpected results since all 32 bits are actually read regardless of the byte enables. partial reads should not be done.

note: all statistics registers are implemented as 32-bit registers. though some logical statistics registers represent counters in excess of 32 bits in width, registers must be accessed using 32-bit operations (such as, independent access to each 32-bit field).

see special notes for vlan filter table and multicast table arrays in their specific register definitions.

10.2 configuration and status registers - csr space

10.2.1 register summary table

all registers are listed in table 77. these registers are ordered by grouping and are not necessarily listed in the order that they appear in the address space.
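the rule that wide statistics counters are read as independent 32-bit accesses can be illustrated with the good octets received pair (gorcl at 0x04088, gorch at 0x0408c in the register summary). the following is a minimal sketch, assuming bar0 is already mapped and visible as an array of 32-bit words. the names csr_read32 and read_good_octets_received are hypothetical, and reading the low dword before the high dword is an assumption of this example, not something stated in this section.

```c
#include <assert.h>
#include <stdint.h>

#define REG_GORCL 0x04088u  /* good octets received count low  */
#define REG_GORCH 0x0408cu  /* good octets received count high */

/* one full 32-bit read of a csr at a dword-aligned byte offset */
static uint32_t csr_read32(const volatile uint32_t *bar0, uint32_t offset)
{
    assert((offset & 0x3u) == 0u);  /* partial or unaligned reads are not allowed */
    return bar0[offset / 4u];
}

/* combine two independent 32-bit reads into one 64-bit counter value */
static uint64_t read_good_octets_received(const volatile uint32_t *bar0)
{
    uint32_t low = csr_read32(bar0, REG_GORCL);
    uint32_t high = csr_read32(bar0, REG_GORCH);

    return ((uint64_t)high << 32) | low;
}
```

the same pattern applies to any register pair where two 32-bit registers make up a larger logical value.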
table 77. 82574 register summary

category | offset | alias offset | abbreviation | name | rw | link to page
general | 0x00000 / 0x00004 | n/a | ctrl | device control register | rw | page 281
general | 0x00008 | n/a | status | device status register | r | page 284
general | 0x00010 | n/a | eec | eeprom/flash control register | rw/ro | page 285
general | 0x00014 | n/a | eerd | eeprom read register | rw | page 287
general | 0x00018 | n/a | ctrl_ext | extended device control register | rw | page 287
general | 0x0001c | n/a | fla | flash access register | rw | page 289
general | 0x00020 | n/a | mdic | mdi control register | rw | page 290
general | 0x00028 | n/a | fcal | flow control address low | rw | page 292
general | 0x0002c | n/a | fcah | flow control address high | rw | page 292
general | 0x00030 | n/a | fct | flow control type | rw | page 293
general | 0x00038 | n/a | vet | vlan ether type | rw | page 293
general | 0x00170 | n/a | fcttv | flow control transmit timer value | rw | page 293
general | 0x05f40 | n/a | fcrtv | flow control refresh threshold value | rw | page 294
general | 0x00e00 | n/a | ledctl | led control | rw | page 294
general | 0x00f00 | n/a | extcnf_ctrl | extended configuration control | rw | page 296
general | 0x00f08 | n/a | extcnf_size | extended configuration size | rw | page 296
general | 0x01000 | n/a | pba | packet buffer allocation | rw | page 297
general | 0x1010 | n/a | eemngctl | mng eeprom control register | ro | page 297
general | 0x1014 | n/a | eemngdata | mng eeprom read/write data | ro | page 298
general | 0x1018 | n/a | flmngctl | mng flash control register | ro | page 298
general | 0x101c | n/a | flmngdata | mng flash read data | ro | page 298
general | 0x1020 | n/a | flmngcnt | mng flash read counter | ro | page 298
general | 0x01028 | n/a | flasht | flash timer register | rw | page 298
general | 0x0102c | n/a | eewr | eeprom write register | rw | page 299
general | 0x1030 | n/a | flswctl | sw flash burst control register | rw | page 299
general | 0x1034 | n/a | flswdata | sw flash burst data register | rw | page 300
general | 0x1038 | n/a | flswcnt | sw flash burst access counter | rw | page 300
general | 0x0103c | n/a | flop | flash opcode register | rw | page 300
general | 0x1050 | n/a | flol | fleep auto load | rw | page 300
pcie | 0x05b00 | n/a | gcr | 3gio control register | rw | page 300
pcie | 0x05b08 | n/a | functag | function-tag register | rw | page 302
pcie | 0x05b10 | n/a | gscl_1 | 3gio statistic control register #1 | rw | page 302
pcie | 0x05b14 | n/a | gscl_2 | 3gio statistic control registers #2 | rw | page 303
pcie | 0x05b18 | n/a | gscl_3 | 3gio statistic control register #3 | rw | page 303
pcie | 0x05b1c | n/a | gscl_4 | 3gio statistic control register #4 | rw | page 303
pcie | 0x05b20 | n/a | gscn_0 | 3gio statistic counter registers #0 | rw | page 303
pcie | 0x05b24 | n/a | gscn_1 | 3gio statistic counter registers #1 | rw | page 303
pcie | 0x05b28 | n/a | gscn_2 | 3gio statistic counter registers #2 | rw | page 303
pcie | 0x05b2c | n/a | gscn_3 | 3gio statistic counter registers #3 | rw | page 304
pcie | 0x05b50 | n/a | swsm | software semaphore register | rw | page 304
pcie | 0x05b64 | n/a | gcr2 | 3gio control register 2 | rw | page 304
pcie | 0x5b68 | n/a | pbaclr | msi-x pba clear | rw1c | page 304
interrupt | 0x000c0 | n/a | icr | interrupt cause read register | rc/wc | page 308
interrupt | 0x000c4 | n/a | itr | interrupt throttling register | r/w | page 310
interrupt | 0x000e8 + 4*n [n = 0..4] | n/a | eitr | extended interrupt throttle | r/w | page 310
interrupt | 0x000c8 | n/a | ics | interrupt cause set register | w | page 311
interrupt | 0x000d0 | n/a | ims | interrupt mask set/read register | rw | page 312
interrupt | 0x000d8 | n/a | imc | interrupt mask clear register | w | page 313
interrupt | 0x000dc | n/a | eiac | interrupt auto clear | rw | page 314
interrupt | 0x000e0 | n/a | iam | interrupt acknowledge auto-mask | rw | page 314
interrupt | 0x000e4 | n/a | ivar | interrupt vector allocation registers | rw | page 314
receive | 0x00100 | n/a | rctl | receive control register | rw | page 315
receive | 0x02170 | n/a | psrctl | packet split receive control register | rw | page 318
receive | 0x02160 | 0x00168 | fcrtl | flow control receive threshold low | rw | page 319
receive | 0x02168 | 0x00160 | fcrth | flow control receive threshold high | rw | page 319
receive | 0x02800 | 0x00110 | rdbal0 | receive descriptor base address low queue 0 | rw | page 320
receive | 0x02804 | 0x00114 | rdbah0 | receive descriptor base address high queue 0 | rw | page 320
receive | 0x02808 | 0x00118 | rdlen0 | receive descriptor length queue 0 | rw | page 320
receive | 0x02810 | 0x00120 | rdh0 | receive descriptor head queue 0 | rw | page 321
receive | 0x02818 | 0x00128 | rdt0 | receive descriptor tail queue 0 | rw | page 321
receive | 0x02820 | 0x00108 | rdtr | rx interrupt delay timer [packet timer] | rw | page 321
receive | 0x02828 | n/a | rxdctl | receive descriptor control | rw | page 322
receive | 0x0282c | n/a | radv | receive interrupt absolute delay timer | rw | page 323
receive | 0x02c00 | n/a | rsrpd | receive small packet detect interrupt | r/w | page 324
receive | 0x02c08 | n/a | raid | receive ack interrupt delay register | rw | page 324
receive | 0x05000 | n/a | rxcsum | receive checksum control | rw | page 324
receive | 0x05008 | n/a | rfctl | receive filter control register | rw | page 326
receive | 0x5010 | n/a | mavtv0 | management vlan tag value 0 | rw | page 326
receive | 0x5014 | n/a | mavtv1 | management vlan tag value 1 | rw | page 327
receive | 0x5018 | n/a | mavtv2 | management vlan tag value 2 | rw | page 327
receive | 0x501c | n/a | mavtv3 | management vlan tag value 3 | rw | page 327
receive | 0x05200-0x053fc | | mta[127:0] | multicast table array | rw | page 327
receive | 0x05400 | 0x00040 | ral(0) | receive address low (0) | rw | page 328
receive | 0x05404 | 0x00044 | rah(0) | receive address high (0) | rw | page 328
receive | 0x05408 | 0x00048 | ral(1) | receive address low (1) | rw | page 328
receive | 0x0540c | 0x0004c | rah(1) | receive address high (1) | rw | page 328
receive | 0x05600-0x057fc | 0x00600-0x007fc | vfta[127:0] | vlan filter table array | rw | page 329
receive | 0x05600-0x057fc | 0x00600-0x006fc | vfta[127:0] | vlan filter table array (n) | rw | page 329
receive | 0x05478 | 0x000b8 | ral(15) | receive address low (15) | rw | page 328
receive | 0x0547c | 0x000bc | rah(15) | receive address high (15) | rw | page 328
receive | 0x05818 | n/a | mrqc | multiple receive queues command register | rw | page 330
receive | 0x05c00-0x05c7f | n/a | reta | redirection table | rw | page 330
receive | 0x05c80-0x05ca7 | n/a | rssrk | rss random key register | rw | page 331
transmit | 0x00400 | n/a | tctl | transmit control register | rw | page 332
transmit | 0x00410 | n/a | tipg | transmit ipg register | rw | page 333
transmit | 0x00458 | n/a | ait | adaptive ifs throttle | rw | page 334
transmit | 0x03800 | 0x00420 | tdbal | transmit descriptor base address low | rw | page 334
transmit | 0x03804 | 0x00424 | tdbah | transmit descriptor base address high | rw | page 335
transmit | 0x03808 | 0x00428 | tdlen | transmit descriptor length | rw | page 335
transmit | 0x03810 | 0x00430 | tdh | transmit descriptor head | rw | page 335
transmit | 0x03818 | 0x00438 | tdt | transmit descriptor tail | rw | page 336
transmit | 0x03840 | n/a | tarc | transmit arbitration count | rw | page 336
transmit | 0x03820 | 0x00440 | tidv | transmit interrupt delay value | rw | page 337
transmit | 0x03828 | n/a | txdctl | transmit descriptor control | rw | page 338
transmit | 0x0382c | n/a | tadv | transmit absolute interrupt delay value | rw | page 339
statistic | 0x04000 | n/a | crcerrs | crc error count | r | page 340
statistic | 0x04004 | n/a | algnerrc | alignment error count | r | page 340
statistic | 0x0400c | n/a | rxerrc | rx error count | r | page 341
statistic | 0x04010 | n/a | mpc | missed packets count | r | page 341
statistic | 0x04014 | n/a | scc | single collision count | r | page 341
statistic | 0x04018 | n/a | ecol | excessive collisions count | r | page 341
statistic | 0x0401c | n/a | mcc | multiple collision count | r | page 342
statistic | 0x04020 | n/a | latecol | late collisions count | r | page 342
statistic | 0x04028 | n/a | colc | collision count | r | page 342
statistic | 0x04030 | n/a | dc | defer count | r | page 342
statistic | 0x04034 | n/a | tncrs | transmit with no crs | r | page 343
statistic | 0x0403c | n/a | cexterr | carrier extension error count | r | page 343
statistic | 0x04040 | n/a | rlec | receive length error count | r | page 343
statistic | 0x04048 | n/a | xonrxc | xon received count | r | page 344
statistic | 0x0404c | n/a | xontxc | xon transmitted count | r | page 344
statistic | 0x04050 | n/a | xoffrxc | xoff received count | r | page 344
statistic | 0x04054 | n/a | xofftxc | xoff transmitted count | r | page 344
statistic | 0x04058 | n/a | fcruc | fc received unsupported count | rw | page 344
statistic | 0x0405c | n/a | prc64 | packets received [64 bytes] count | rw | page 345
statistic | 0x04060 | n/a | prc127 | packets received [65-127 bytes] count | rw | page 345
statistic | 0x04064 | n/a | prc255 | packets received [128-255 bytes] count | rw | page 345
statistic | 0x04068 | n/a | prc511 | packets received [256-511 bytes] count | rw | page 345
statistic | 0x0406c | n/a | prc1023 | packets received [512-1023 bytes] count | rw | page 346
statistic | 0x04070 | n/a | prc1522 | packets received [1024 to max bytes] count | rw | page 346
statistic | 0x04074 | n/a | gprc | good packets received count | r | page 346
statistic | 0x04078 | n/a | bprc | broadcast packets received count | r | page 347
statistic | 0x0407c | n/a | mprc | multicast packets received count | r | page 347
statistic | 0x04080 | n/a | gptc | good packets transmitted count | r | page 347
statistic | 0x04088 | n/a | gorcl | good octets received count low | r | page 347
statistic | 0x0408c | n/a | gorch | good octets received count high | r | page 347
statistic | 0x04090 | n/a | gotcl | good octets transmitted count low | r | page 348
statistic | 0x04094 | n/a | gotch | good octets transmitted count high | r | page 348
statistic | 0x040a0 | n/a | rnbc | receive no buffers count | r | page 348
statistic | 0x040a4 | n/a | ruc | receive undersize count | r | page 348
statistic | 0x040a8 | n/a | rfc | receive fragment count | r | page 349
statistic | 0x040ac | n/a | roc | receive oversize count | r | page 349
statistic | 0x040b0 | n/a | rjc | receive jabber count | r | page 349
statistic | 0x040b4 | n/a | mngprc | management packets received count | r | page 349
statistic | 0x040b8 | n/a | mpdc | management packets dropped count | r | page 350
statistic | 0x040bc | n/a | mptc | management packets transmitted count | r | page 350
statistic | 0x040c0 | n/a | torl | total octets received | r | page 350
statistic | 0x040c4 | n/a | torh | total octets received | r | page 350
statistic | 0x040c8 | n/a | tot | total octets transmitted | rw | page 351
statistic | 0x040d0 | n/a | tpr | total packets received | rw | page 351
statistic | 0x040d4 | n/a | tpt | total packets transmitted | rw | page 351
statistic | 0x040d8 | n/a | ptc64 | packets transmitted [64 bytes] count | rw | page 352
statistic | 0x040dc | n/a | ptc127 | packets transmitted [65-127 bytes] count | rw | page 352
statistic | 0x040e0 | n/a | ptc255 | packets transmitted [128-255 bytes] count | rw | page 352
statistic | 0x040e4 | n/a | ptc511 | packets transmitted [256-511 bytes] count | rw | page 353
statistic | 0x040e8 | n/a | ptc1023 | packets transmitted [512-1023 bytes] count | rw | page 353
statistic | 0x040ec | n/a | ptc1522 | packets transmitted [greater than 1024 bytes] count | rw | page 353
statistic | 0x040f0 | n/a | mptc | multicast packets transmitted count | rw | page 353
statistic | 0x040f4 | n/a | bptc | broadcast packets transmitted count | rw | page 354
statistic | 0x040f8 | n/a | tsctc | tcp segmentation context transmitted count | rw | page 354
statistic | 0x040fc | n/a | tsctfc | tcp segmentation context transmit fail count | rw | page 354
statistic | 0x04100 | n/a | iac | interrupt assertion count | r | page 354
management | 0x05800 | n/a | wuc | wake up control register | rw | page 355
management | 0x05808 | n/a | wufc | wake up filter control register | rw | page 356
management | 0x05810 | n/a | wus | wake up status register | rw | page 356
management | 0x05828 | n/a | mfutp01 | management flex udp/tcp ports 0/1 | rw | page 357
management | 0x05830 | n/a | mfutp23 | management flex udp/tcp port 2/3 | rw | page 357
management | 0x5838 | n/a | ipav | ip address valid | rw | page 357
management | 0x05840-0x05858 | n/a | ip4at | ipv4 address table | rw | page 358
management | 0x05820 | n/a | manc | management control register | rw | page 358
management | 0x5860 | n/a | manc2h | management control to host register | rw | page 359
management | 0x5824 | n/a | mfval | manageability filters valid | rw | page 360
management | 0x5890 + 4*n [n=0..7] | n/a | mdef | manageability decision filters | rw | page 360
management | 0x05880-0x0588f | n/a | ip6at | ipv6 address table | rw | page 361
management | 0x05a00-0x05a7c | n/a | wupm | wake up packet memory [128 bytes] | r | page 362
management | 0x05b30 | n/a | factps | function active and power state to mng | ro | page 362
management | 0x05f00-0x05f28 | n/a | fflt | flexible filter length table | rw | page 362
management | 0x09000-0x093f8 | n/a | ffmt | flexible filter mask table | rw | page 363
management | 0x09400-0x097f8 | n/a | ftft | flexible tco filter table | rw | page 363
management | 0x09800-0x09bf8 | n/a | ffvt | flexible filter value table | rw | page 364
time sync | 0x0b620 | n/a | tsyncrxctl | rx time sync control register | rw | page 365
time sync | 0x0b628 | n/a | rxstmph | rx timestamp high | rw | page 365
time sync | 0x0b624 | n/a | rxstmpl | rx timestamp low | rw | page 365
time sync | 0x0b62c | n/a | rxsatrl | rx timestamp attributes low | rw | page 365
time sync | 0x0b630 | n/a | rxsatrh | rx timestamp attributes high | rw | page 366
time sync | 0x0b634 | n/a | rxcfgl | rx ethertype and message type register | rw | page 366
time sync | 0x0b638 | n/a | rxudp | rx udp port | rw | page 366
time sync | 0x0b614 | n/a | tsynctxctl | tx time sync control register | rw | page 366
time sync | 0x0b618 | n/a | txstmpl | tx timestamp value low | rw | page 367
time sync | 0x0b61c | n/a | txstmph | tx timestamp value high | rw | page 367
time sync | 0x0b600 | n/a | systiml | system time register low | rw | page 367
time sync | 0x0b604 | n/a | systimh | system time register high | rw | page 367
time sync | 0x0b608 | n/a | timinca | increment attributes register | rw | page 367
time sync | 0x0b60c | n/a | timadjl | time adjustment offset register low | rw | page 367
time sync | 0x0b610 | n/a | timadjh | time adjustment offset register high | rw | page 368
msi-x | bar3: 0x0000 + n*0x10 [n=0..4] | n/a | msixtadd | msi-x table entry lower address | r/w | page 369
msi-x | bar3: 0x0004 + n*0x10 [n=0..4] | n/a | msixtuadd | msi-x table entry upper address | r/w | page 369
msi-x | bar3: 0x0008 + n*0x10 [n=0..4] | n/a | msixtmsg | msi-x table entry message | r/w | page 369
msi-x | bar3: 0x000c + n*0x10 [n=0..4] | n/a | msixtvctrl | msi-x table entry vector control | r/w | page 369
msi-x | bar3: 0x02000 | n/a | msixpba | msi-x pba bit description | ro | page 370
diagnostic | 0x00f10 | n/a | poemb | phy oem bits register | rw | page 399
diagnostic | 0x02410 | 0x08000 | rdfh | receive data fifo head register | rw | page 399
diagnostic | 0x02418 | 0x08008 | rdft | receive data fifo tail register | rw | page 400
diagnostic | 0x02420 | n/a | rdfhs | receive data fifo head saved register | rw | page 400
diagnostic | 0x02428 | n/a | rdfts | receive data fifo tail saved register | rw | page 400
diagnostic | 0x02430 | n/a | rdfpc | receive data fifo packet count | rw | page 401
diagnostic | 0x03410 | 0x08010 | tdfh | transmit data fifo head register | rw | page 401
diagnostic | 0x03418 | 0x08018 | tdft | transmit data fifo tail register | rw | page 401
diagnostic | 0x03420 | n/a | tdfhs | transmit data fifo head saved register | rw | page 402
281 driver programing interface?82574 gbe controller note: certain registers maintain an alias addre ss designed for backward compatibility with software written for previous devices. for these registers, the alias address is shown in ta b l e 7 7 . those registers can be accessed by software at either the new offset or the alias offset. it is recommended that software written solely for the 82574, use the new address offset. 10.2.2 general regi ster descriptions 10.2.2.1 device control register - ctrl (0x00000 / 0x00004; rw) diagnostic 0x03428 n/a tdfts transmit data fifo tail saved register rw page 402 diagnostic 0x03430 n/a tdfpc transmit data fifo packet count rw page 402 diagnostic 0x10000 - 0x17fff n/a pbm packet buffer memory rw page 402 diagnostic 0x01008 n/a pbs packet buffer size rw page 403 field bit(s) initial value description fd 0 1b 1 full duplex 0b = half duplex 1b = full duplex. controls the mac du plex setting when explicitly set by software. reserved 1 0b reserved write as 0b for future compatibility. gio master disable 20b when set, the 82574 blocks ne w master requests, including manageability requests, by this func tion. once no master requests are pending by this function, the gio master enable status bit is set. reserved 3 1b reserved set to 1b. reserved 4 0b reserved write as 0b for future compatibility. asde 5 0b 1 auto-speed detection enable when set to 1b, the mac ignores th e speed indicated by the phy and attempts to automatically detect the resolved speed of the link and configure itself appropriately. this bit must be set to 0b in the 82574. slu 6 0b 1 set link up the set link up bit must be set to 1b to permit the mac to recognize the link signal from the phy, wh ich indicates the phy has gotten the link up, and to receive and transmit data. see section 3.2.3 for more information about auto-negotiation and link configuration in the various modes. set link up is normally initia lized to 0b. 
however, if the apm enable bit is set in the nvm then it is initialized to 1b. reserved 7 0b reserved. must be set to 0b. category offset alias offset abbreviation name rw link to page
82574 gbe controller?driver programing interface 282 speed 9:8 10b speed selection these bits can determine the speed configuration and are written by software after reading the phy configuration through the mdio interface. these signals are ignored when auto-speed detection is enabled. see section 3.2.1 for details. 00b = 10 mb/s 01b = 100 mb/s 10b = 1000 mb/s 11b =not used reserved 10 0b reserved write as 0b for future compatibility. frcspd 11 0b 1 force speed this bit is set when software wants to manually configure the mac speed settings according to the speed bits. when using a phy device, note that the phy device must resolve to the same speed configuration, or software must manually set it to the same speed as the mac. note that this bit is su perseded by the ctrl_ext.spd_byps bit which has a similar function. frcdplx 12 0b force duplex when set to 1b, software might ove rride the duplex indication from the phy that is indicated in the fdx to the mac. otherwise, the duplex setting is sampled from the phy fdx indication into the mac on the asserting edge of the phy link signal. when asserted, the ctrl.fd bit sets duplex. reserved 19:13 0x0 reserved reads as 0b. advd3wuc 20 1b d3cold wakeup capability advertisement enable when set, d3cold wakeup capability is advertised based on whether the aux_pwr advertises presence of auxiliary power (yes if aux_pwr is indicated, no otherw ise). when 0b, however, d3cold wakeup capability is not advertis ed even if aux_pwr presence is indicated. note: this bit must be set to 1b. reserved 25:21 0x0 reserved rst 26 0b device reset this bit performs a reset of the mac function of the device, as described in section 10.2.2.2 . normally 0b; writin g 1b initiates the reset. this bit is self-clearing. rfce 27 0b receive flow control enable indicates that the device responds to the reception of flow control packets. reception of flow control packets requires the correct loading of the fcal/h and fct registers. 
if auto-negotiation is enabled, this bit is set to the negotiated duplex value. see section 3.2.3 for more information about auto-negotiation. tfce 28 0b transmit flow control enable indicates that the device transmits flow control packets (xon and xoff frames) based on receiver fulln ess. if auto-negotiation is enabled, this bit is set to th e negotiated duplex value. see section 3.2.3 for more information about auto-negotiation. reserved 29 0b reserved reads as 0b. vme 30 0b vlan mode enable when set to 1b, all packets transmitted from the 82574 that have vle set is sent with an 802.1q header added to the packet. the contents of the header come from the transm it descriptor and from the vlan type register. on receive, vlan information is stripped from 802.1q packets. see section 7.5.1 for more details. field bit(s) initial value description
283 driver programing interface?82574 gbe controller this register, as well as the extended device control (ctrl_ext) register, controls the major operational modes for the device. while a software write to this register to control device settings, several bits (such as fd and speed ) might be overridden depending on other bit settings and the resu ltant link configuration determined by the phy's auto-negotiation resolution. see section 3.2.3 for a detailed explanation on the link configuration process. note: in half-duplex mode, the 82574 transmits carrier extended packets and can receive both carrier extended packets and packets transmitted with bursting. when using an internal phy, the fd (duplex) and speed configuration of the device is normally determined from the link configur ation process. software can specifically override/set these mac settings via these bits in a forced-link scenario; if so, the values used to configure the mac must be consistent with the phy settings. manual link configuration is controlled through the phy's mii management interface. the advd3wuc bit (advertise d3cold wakeup capability enable control) enables the aux_pwr pin to determine whether d3cold support is advertised. if full 1 gb/s operation in d3 state is desired but the system's power requirements in this mode would exceed the d3cold wakeup-enabled spec ification limit (375 ma at 3.3 v dc), this bit can be used to prevent the capability from being advertised to the system. when using the internal phy, by default the phy re-negotiates the lowest functional link speed in d3 and d0u states. the phyreg 25.2 bit enables this capability to be disabled, in case full 1 gb/s speed is desired in these states. note: the 82574 internal phy automatically dete cts an unplugged lan cable and reduce operational power to the minimal amount re quired to maintain system operation. controller operations are not affected, except for the inability to transmit/receive due to the lost link. 
device reset (rst) might be used to globally reset the entire component. this bit is provided primarily as a last-ditch software mechanism to recover from an indeterminate or suspected hung hardware state. most registers (receive, transmit, interrupt, statistics, etc.) and state machines are set to their power-on reset values, approximating the state following a power-on or pci reset. however, pcie configuration registers are not reset, thereby leaving the device mapped into system memory space and accessible by a software device driver. one internal configuration register, the packet buffer allocation (pba) register, also retains its value through a global reset.

note: to ensure that global device reset has fully completed and that the 82574 responds to subsequent accesses, designers must wait approximately 1 µs after resetting before attempting to check whether the bit has cleared or attempting to access (read or write) any other device register. before issuing this reset, software has to ensure that tx and rx processes are stopped by following the procedure described in section 3.1.3.10.

phy_rst | 31 | 0b | phy reset. controls a hardware-level reset to the internal phy. 0b = normal (operational). 1b = reset to phy asserted.
[1] these bits are read from the nvm.
10.2.2.2 device status register - status (0x00008; r)

fd reflects the actual mac duplex configuration. this normally reflects the duplex setting for the entire link, as it normally reflects the duplex configuration negotiated between the phy and link partner (copper link) or mac and link partner (fiber link).

link up provides a useful indication of whether something is attached to the port. successful negotiation of features/link parameters results in link activity. the link start-up process (and consequently the duration for this activity after reset) can be several 100's of µs. it reflects whether the phy's link indication is present. refer to section 3.2.3 for more details.

txoff indicates the state of the transmit function when symmetrical flow control has been enabled and negotiated with the link partner. this bit is set to 1b when transmission is paused due to the reception of an xoff frame. it is cleared upon expiration of the pause timer or the receipt of an xon frame.

field | bit(s) | initial value | description
fd | 0 | x | full duplex. 0b = half duplex. 1b = full duplex. reflects duplex setting of the mac and/or link.
lu | 1 | x | link up. 0b = no link established. 1b = link established. for this to be valid, the set link up bit of the device control (ctrl.su) register must be set.
reserved | 3:2 | 00b | reserved.
txoff | 4 | x | transmission paused. indication of pause state of the transmit function when symmetrical flow control is enabled.
reserved | 5 | 0b | reserved.
speed | 7:6 | x | link speed setting. reflects speed setting of the mac and/or link. 00b = 10 mb/s. 01b = 100 mb/s. 10b = 1000 mb/s. 11b = 1000 mb/s.
asdv | 9:8 | x | auto-speed detection value. speed result sensed by the mac auto-detection function.
phyra | 10 | 1b | phy reset asserted. this bit is read/write. hardware sets this bit following the assertion of phy reset. the bit is cleared by writing 0b to it. this bit is used by firmware as an indication for required initialization of the phy.
reserved | 18:11 | 0x0 | reserved.
gio master enable status | 19 | 1b | cleared by the 82574 when the gio master disable bit is set and no master requests are pending by this function. set otherwise. when cleared, indicates that no master requests are issued by this function as long as the gio master disable bit is set.
reserved | 30:20 | 0x0 | reserved. reads as 0b.
reserved | 31 | 0b | reserved.
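
the field layout in the status table can be illustrated with a small decoder. this is a minimal sketch, not driver code: the function name `decode_status` and the dictionary keys are hypothetical, and only the bit positions and the speed encoding come from the table above.

```python
# speed encoding from the status register table (bits 7:6)
SPEED_MBPS = {0b00: 10, 0b01: 100, 0b10: 1000, 0b11: 1000}


def decode_status(value: int) -> dict:
    """split a 32-bit status register value into its documented fields.

    the key names are illustrative; the bit positions follow the table.
    """
    return {
        "full_duplex": bool(value & (1 << 0)),          # fd, bit 0
        "link_up": bool(value & (1 << 1)),              # lu, bit 1
        "tx_paused": bool(value & (1 << 4)),            # txoff, bit 4
        "speed_mbps": SPEED_MBPS[(value >> 6) & 0x3],   # speed, bits 7:6
        "asdv": (value >> 8) & 0x3,                     # asdv, bits 9:8
        "phy_reset_asserted": bool(value & (1 << 10)),  # phyra, bit 10
        "gio_master_enabled": bool(value & (1 << 19)),  # bit 19
    }
```

for example, a raw value of 0x00080083 decodes as full duplex, link up, 1000 mb/s, with the gio master enable status bit set.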
speed indicates the actual mac speed configuration. these bits normally reflect the speed of the actual link, negotiated by the phy and link partner, and reflected internally from the phy to the mac (spd_ind). these bits might represent the speed configuration of the mac only, if the mac speed setting has been forced via software (ctrl.speed) or mac auto-speed detection is used. speed indications are mapped as follows:
00b = 10 mb/s
01b = 100 mb/s
10b = 1000 mb/s
11b = 1000 mb/s

if auto-speed detection is enabled, the device's speed is configured only once after the link signal is asserted by the phy.

the asdv bits are provided for diagnostics purposes only. even if the mac speed configuration is not set using this function (asde = 0b), the asd calculation can be initiated by software writing a logic one to the ctrl_ext.asdchk bit. the resultant speed detection is reflected in these bits.

10.2.2.3 eeprom/flash control register - eec (0x00010; rw/ro)

field | bit(s) | initial value | description
ee_sk | 0 | 0b | clock input to the nvm. when ee_gnt is 1b, the ee_sk output signal is mapped to this bit and provides the serial clock input to the nvm. software clocks the nvm via toggling this bit with successive writes.
ee_cs | 1 | 0b | chip select input to the nvm. when ee_gnt is 1b, the ee_cs output signal is mapped to the chip select of the nvm device. software enables the nvm by writing a 1b to this bit.
ee_di | 2 | 0b | data input to the nvm. when ee_gnt is 1b, the ee_di output signal is mapped directly to this bit. software provides data input to the nvm via writes to this bit.
ee_do | 3 | x | data output bit from the nvm. the ee_do input signal is mapped directly to this bit in the register and contains the nvm data output. this bit is read-only from the software perspective; writes to this bit have no effect.
fwe | 5:4 | 01b | flash write enable control. these two bits control whether writes to the flash are allowed.
00b = enable flash erase and block erase. 01b = flash writes and flash erase disabled. 10b = flash writes enabled. 11b = not allowed. this field enables write and erase instructions from software to the flash via the flash bar and the software dma registers (flsw).
ee_req | 6 | 0b | request nvm access. software must write a 1b to this bit to get direct nvm access. it has access when ee_gnt is 1b. when software completes the access it must write a 0b.
ee_gnt | 7 | 0b | grant nvm access. when this bit is set to 1b, software can access the nvm using the sk, cs, di, and do bits.
this register provides software direct access to the nvm. software can control the nvm by successive writes to this register. data and address information is clocked into the eeprom by software toggling the ee_sk bit of this register with ee_cs set to 1.

ee_pres | 8 | x | nvm present. when set to 1b, this bit indicates that an nvm (either flash or eeprom) is present and has the correct signature field. this bit is read only.
auto_rd | 9 | 0b | nvm auto read done. when set to 1b, this bit indicates that the auto read by hardware from the nvm is done. this bit is also set when the nvm is not present or when its signature is not valid. this field is read only.
reserved | 10 | 0b | reserved.
nvsize | 14:11 | 0010b [1] | nvm size. this field defines the size of the nvm in bytes, which equals 128 * 2 ** nvsize. this field is loaded from word 0x0f in the nvm. this field is read only.
nvadds | 16:15 | 00b | nvm address size. this field defines the address size of the nvm: 00b = reserved. 01b = eeprom with 1 address byte. 10b = eeprom with 2 address bytes. 11b = flash with 3 address bytes. this field is set at power up by the nvmt strapping pin. with the eeprom, the address length is set following a detection of the signature bits in word 0x12. if an eeprom is attached to the 82574 and a valid signature is not found, software can modify this field, enabling parallel access to the empty device. in all other cases, writes to this field do not affect the device operation.
reserved | 17 | 0b | reserved.
reserved | 18 | 0b | reserved.
reserved | 19 | 0b | reserved.
aupden | 20 | 0b | enable autonomous flash update. 1b = enables the 82574 to update the flash autonomously. the autonomous update is triggered by write cycles and expiration of the flasht timer. 0b = disables the auto-update logic.
reserved | 21 | 0b | reserved.
sec1val | 22 | 0b | sector 1 valid. when ee_pres is set, a 0b indicates that s0 in the flash contains valid signatures.
1b indicates that s1 contains valid signatures. in eeprom setup or if ee_pres is not set, sec1val is 0b.
nvmtype | 23 | 0b [2] | nvm type. this is a read-only field indicating the nvm type: 0b = eeprom. 1b = flash. this bit is loaded from nvm word 0x0f and is informational only (the design uses strapping to determine the actual nvm type).
reserved | 24 | 0b | reserved.
reserved | 25 | 0b | reserved.
reserved | 31:26 | 0x0 | reserved. reads as 0b.
[1] these bits are read from the nvm.
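
the nvsize formula and the encoded eec fields can be checked with a short sketch. the helper names below are hypothetical; the bit positions, the formula 128 * 2 ** nvsize, and the encodings are taken from the eec table.

```python
# encodings copied from the eec register table
NVADDS = {
    0b00: "reserved",
    0b01: "eeprom with 1 address byte",
    0b10: "eeprom with 2 address bytes",
    0b11: "flash with 3 address bytes",
}
FWE = {
    0b00: "flash erase and block erase enabled",
    0b01: "flash writes and flash erase disabled",
    0b10: "flash writes enabled",
    0b11: "not allowed",
}


def nvm_size_bytes(eec: int) -> int:
    """nvm size in bytes: 128 * 2 ** nvsize, with nvsize in bits 14:11."""
    nvsize = (eec >> 11) & 0xF
    return 128 * 2 ** nvsize


def nvm_address_mode(eec: int) -> str:
    """decode nvadds (bits 16:15)."""
    return NVADDS[(eec >> 15) & 0x3]


def flash_write_mode(eec: int) -> str:
    """decode fwe (bits 5:4)."""
    return FWE[(eec >> 4) & 0x3]
```

with the default nvsize of 0010b, `nvm_size_bytes` gives 128 * 4 = 512 bytes.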
data output from the nvm is latched into bit 3 of this register via the internal 62.5 mhz clock and may be accessed by software via reads of this register. see section 3.3.8 for details.

note: writes to the flash device must not be attempted when writes are disabled (fwe = 01b). behavior after such an operation is undefined, and can result in component and/or system hangs.

10.2.2.4 eeprom read register - eerd (0x00014; rw)

this register is used by software to cause the 82574 to read individual words in the eeprom. to read a word, software writes the address to the read address field and simultaneously writes a 1b to the start read field. the 82574 reads the word from the eeprom and places it in the read data field, setting the read done field to 1b. software can poll this register, looking for a 1b in the read done field, and then use the value in the read data field.

note: when this register is used to read a word from the eeprom, that word is not written to any of the 82574's internal registers even if it is normally a hardware accessed word.

field | bit(s) | initial value | description
start | 0 | 0b | start read. writing a 1b to this bit causes the 82574 to read a 16-bit word at the address stored in the addr field from the nvm. the result is stored in the data field. this bit is self-clearing.
done | 1 | 1b | read done. set to 1b when the word read completes. set to 0b when the read is in progress. writes by software are ignored.
addr | 15:2 | 0x0 | read address. this field is written by software along with start read to indicate the word address of the word to read.
data | 31:16 | 0x0 | read data. data returned from the nvm.

10.2.2.5 extended device control register - ctrl_ext (0x00018; rw)

field | bit(s) | initial value | description
reserved | 11:0 | 0x0 | reserved.
asdchk | 12 | 0b | asd (auto speed detection) check. initiates an asd sequence to sense the frequency of the rx_clk signal from the phy. the results are reflected in status.asdv.
this bit is self-clearing.
ee_rst | 13 | 0b | eeprom reset. initiates a reset-like event to the eeprom function. this causes the eeprom to be read as if a pci_rst_n assertion had occurred. note: all device functions should be disabled prior to setting this bit. this bit is self-clearing.
reserved | 14 | 0b [1] | reserved. should be set to 0b.
spd_byps | 15 | 0b | speed select bypass. when set to 1b, all speed detection mechanisms are bypassed and the device is immediately set to the speed indicated by ctrl.speed. this provides a method for software to have full control of the speed settings of the device as well as when the change takes place by overriding the hardware clock switching circuitry.
reserved | 16 | 0b [1] | reserved. should be set to 0b.
ro_dis | 17 | 0b | relaxed ordering disable. when set to 1b, the device does not request any relaxed ordering transactions regardless of the state of bit 4 (enable relaxed ordering) in the pcie device control register. when this bit is cleared and bit 4 of the pcie device control register is set, the device requests relaxed ordering transactions as described in section 3.1.3.8.2.
reserved | 18 | 0b | reserved.
dma dynamic gating enable | 19 | 0b [1] | when set, this bit enables dynamic clock gating of the dma and mac units.
phy power down enable | 20 | 1b [1] | when set, this bit enables the phy to enter a low-power state.
reserved | 21 | 0b [1] | reserved.
tx ls flow | 22 | 0b [1] | should be set for correct tso functionality. refer to section 7.3.
tx ls | 23 | 0b [1] | should be cleared for correct tso functionality. refer to section 7.3.
eiame | 24 | 0b | extended interrupt auto mask enable. when set (usually in msi-x mode), upon firing of an msi-x message, bits set in iam associated with this message are cleared. otherwise, eiam is used only upon a read of the eicr register.
reserved | 26:25 | 00b | reserved.
iame | 27 | 0b | interrupt acknowledge auto-mask enable. when this bit is set, a read or write to the icr register has the side effect of writing the value in the iam register to the imc register. when this bit is 0b, the feature is disabled.
drv_load | 28 | 0b | driver loaded. this bit should be set by the software device driver after it is loaded, and cleared when the software device driver unloads or on a pcie soft reset.
the management controller (mc) loads this bit to indicate that the software device driver has been loaded.
int_timers_clear_ena | 29 | 0b | when set, this bit enables the clearing of the interrupt timers following an ims clear. in this state, successive interrupts occur only after the timers expire again. when cleared, successive interrupts following ims clear might happen immediately.
reserved | 30 | 0b | reserved. reads as 0b.
pba_support | 31 | 0b | pba support. when set, setting one of the extended interrupt masks via ims causes the pba bit of the associated msi-x vector to be cleared. otherwise, the 82574 behaves in a way supporting legacy int-x interrupts. should be cleared when working in int-x or msi mode and set in msi-x mode.
[1] these bits are read from the nvm.
this register provides extended control of device functionality beyond that provided by the device control (ctrl) register.

note: device control register values are changed by a read of the eeprom, which occurs upon assertion of the ee_rst bit. therefore, if software uses the ee_rst function and desires to retain current configuration information, the contents of the control registers should be read and stored by software.

note: the eeprom reset function might read configuration information out of the eeprom which affects the configuration of pcie configuration space bar settings. the changes to the bars are not visible unless the system is rebooted and the bios is allowed to re-map them.

note: the spd_byps bit performs a similar function to the ctrl.frcspd bit in that the device's speed settings are determined by the value software writes to the ctrl.speed bits. however, with the spd_byps bit asserted, the settings in ctrl.speed take effect immediately rather than waiting until after the device's clock switching circuitry performs the change.

10.2.2.6 flash access register - fla (0x0001c; rw)

field | bit(s) | initial value | description
fl_nvm_sk | 0 | 0b | clock input to the flash. when fl_gnt is 1, the fl_nvm_sk output signal is mapped to this bit and provides the serial clock input to the flash. software clocks the flash via toggling this bit with successive writes.
fl_ce | 1 | 0b | chip select input to the flash. when fl_gnt is 1, the fl_ce output signal is mapped to the chip select of the flash device. software enables the flash by writing a 0 to this bit.
fl_si | 2 | 0b | data input to the flash. when fl_gnt is 1, the fl_si output signal is mapped directly to this bit. software provides data input to the flash via writes to this bit.
fl_so [1] | 3 | x | data output bit from the flash. the fl_so input signal is mapped directly to this bit in the register and contains the flash serial data output.
this bit is read-only from the software perspective; writes to this bit have no effect.
fl_req | 4 | 0b | request flash access. the software must write a 1 to this bit to get direct flash access. it has access when fl_gnt is 1. when the software completes the access it must write a 0.
fl_gnt | 5 | 0b | grant flash access. when this bit is set to 1b, the software can access the flash using the sk, cs, di, and do bits.
fl_dev_er_ind | 6 | 0b | status bit. indicates manageability initiated a device erase transaction to the flash.
fl_sec_er_ind | 7 | 0b | status bit. indicates manageability initiated a sector erase transaction to the flash.
fl_wr_ind | 8 | 0b | status bit. indicates manageability initiated a write transaction to the flash.
sw_wr_done | 9 | 1b | status bit. indicates that the last lan_bar or lan_exp write was done.
reserved | 10 | 1b | reserved.
reserved | 29:11 | 0x0 | reserved. reads as 0b.
fl_busy | 30 | 0b | flash busy. this bit is set to 1b while a transaction to the flash is in progress. while this bit is clear (read as 0b), software can access the flash. this field is read only.
fl_er | 31 | 0b | flash erase command. the command is sent to the flash only if bits 5:4 in the eec register are set to 00b. this bit is auto-cleared and read as 0b. certain flash vendors do not support this operation.

note: this register provides the software with direct access to the flash. software can control the flash by successive writes to this register. data and address information is clocked into the flash by software toggling the fl_nvm_sk bit (0) of this register with fl_ce set to 1. data output from the flash is latched into bit 3 of this register via the internal 125 mhz clock and may be accessed by software via reads of this register.

note: in the 82574, the fla register is only reset at internal power on reset and not, as in legacy devices, at a software reset.

10.2.2.7 mdi control register - mdic (0x00020; rw)

this register is used by software to read or write management data interface (mdi) registers in a gmii/mii phy.

field | bit(s) | initial value | description
data | 15:0 | x | data. in a write command, software places the data bits and the mac shifts them out to the phy. in a read command, the mac reads these bits serially from the phy and software can read them from this location.
regadd | 20:16 | 0x0 | phy register address; i.e., reg 0, 1, 2, ... 31.
phyadd | 25:21 | 0x0 | phy address. 1 = gigabit phy. 2 = pcie phy.
op | 27:26 | 0x0 | op-code. 01b = mdi write. 10b = mdi read. other values are reserved.
r | 28 | 1b | ready bit. set to 1b by the 82574 at the end of the mdi transaction (for example, indicates a read or write has been completed). it should be reset to 0b by software at the same time the command is written.
i | 29 | 0b | interrupt enable. when set to 1b by software, it causes an interrupt to be asserted to indicate the end of an mdi cycle.
e | 30 | 0b | error. this bit is set to 1b by hardware when it fails to complete an mdi read. software should make sure this bit is clear (0b) before making an mdi read or write command.
reserved | 31 | 0b | reserved. write as 0b for future compatibility.
for an mdi read cycle, the sequence of events is as follows:
1. the cpu performs a pcie write cycle to the mii register with:
   a. ready = 0b.
   b. interrupt enable bit set to 1b or 0b.
   c. op-code = 10b (read).
   d. phyadd = phy address from the mdi register.
   e. regadd = register address of the specific register to be accessed (0 through 31).
2. the mac applies the following sequence on the mdio signal to the phy: <preamble><01><10><phyadd><regadd><z>, where the z stands for the mac tri-stating the mdio signal.
3. the phy returns the following sequence on the mdio signal: <0><data><idle>.
4. the mac discards the leading bit and places the following 16 data bits in the mii register.
5. the 82574 asserts an interrupt indicating mdi done if the interrupt enable bit was set.
6. the 82574 sets the ready bit in the mii register, indicating the read is complete.
7. the cpu can read the data from the mii register and issue a new mdi command.

for an mdi write cycle, the sequence of events is as follows:
1. the cpu performs a pcie write cycle to the mii register with:
   a. ready = 0b.
   b. interrupt enable bit set to 1b or 0b.
   c. op-code = 01b (write).
   d. phyadd = phy address from the mdi register.
   e. regadd = register address of the specific register to be accessed (0 through 31).
   f. data = specific data for desired control of the phy.
2. the mac applies the following sequence on the mdio signal to the phy: <preamble><01><01><phyadd><regadd><10><data><idle>.
3. the 82574 asserts an interrupt indicating mdi done if the interrupt enable bit was set.
4. the 82574 sets the ready bit in the mii register to indicate step 2 has been completed.
5. the cpu can issue a new mdi command.

note: an mdi read or write might take as long as 64 µs from the cpu write to the ready bit assertion. if an invalid op-code is written by software, the mac does not execute any accesses to the phy registers.
if the phy does not generate a zero as the second bit of the turn-around cycle for reads, the mac aborts the access, sets the e (error) bit, writes 0xffff to the data field to indicate an error condition, and sets the ready bit.
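
the mdic command word written in step 1 of both sequences, and the check software performs on the value it reads back, can be sketched as follows. this is a minimal illustration under stated assumptions: the function and constant names are hypothetical, register access itself is not modeled, and only the field positions (data 15:0, regadd 20:16, phyadd 25:21, op 27:26, r 28, i 29, e 30) come from the mdic table.

```python
MDIC_OP_WRITE = 0b01
MDIC_OP_READ = 0b10
MDIC_READY = 1 << 28    # r bit
MDIC_INT_EN = 1 << 29   # i bit
MDIC_ERROR = 1 << 30    # e bit


def mdic_command(op: int, phyadd: int, regadd: int,
                 data: int = 0, interrupt: bool = False) -> int:
    """build the 32-bit value software writes to mdic to start a cycle.

    ready (bit 28) and error (bit 30) are left at 0b, as the sequence
    requires.
    """
    if op not in (MDIC_OP_WRITE, MDIC_OP_READ):
        raise ValueError("other op-code values are reserved")
    if not 0 <= phyadd <= 31 or not 0 <= regadd <= 31:
        raise ValueError("phyadd and regadd are 5-bit fields")
    if not 0 <= data <= 0xFFFF:
        raise ValueError("data is a 16-bit field")
    word = data | (regadd << 16) | (phyadd << 21) | (op << 26)
    if interrupt:
        word |= MDIC_INT_EN
    return word


def mdic_result(word: int):
    """interpret a value read back from mdic.

    returns None while the cycle is still in progress, raises OSError
    if the e bit is set, and otherwise returns the 16-bit data field.
    """
    if not word & MDIC_READY:
        return None
    if word & MDIC_ERROR:
        raise OSError("mdi access failed (e bit set)")
    return word & 0xFFFF
```

a driver would write the value from `mdic_command` to the register, then poll and pass each read value to `mdic_result` until it stops returning None, allowing for the 64 µs worst case mentioned in the note.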
10.2.2.8 flow control address low - fcal (0x00028; rw)

flow control packets are defined by 802.3x as packets sent to either a unique multicast address or the station address, with the ethertype field indicating pause. hardware compares incoming packets against the fca register value to determine if it should pause its output.

this register contains the lower bits of the internal 48-bit flow control ethernet address. all 32 bits are valid. software can access the high and low registers as a register pair if it can perform a 64-bit access to the pcie bus. this register should be programmed with 0x00_c2_80_01. the complete flow control multicast address is 0x01_80_c2_00_00_01, where 01 is the first byte on the wire, 80 is the second, etc.

note: any packet matching the contents of {fcah, fcal, fct} when ctrl.rfce is set is acted on by the 82574. whether flow control packets are passed to the host (software) depends on the state of the rctl.dpf bit and whether the packet matches any of the normal filters.

field | bit(s) | initial value | description
fcal | 31:0 | x | flow control address low.

10.2.2.9 flow control address high - fcah (0x0002c; rw)

this register contains the upper bits of the 48-bit flow control ethernet address. only the lower 16 bits of this register have meaning. the complete flow control address is {fcah, fcal}. this register should be programmed with 0x01_00. the complete flow control multicast address is 0x01_80_c2_00_00_01, where 01 is the first byte on the wire, 80 is the second, etc.

note: at the time of the original implementation, the flow control multicast address was not defined and thus hardware provided programmability. since then, the final release of the 802.3x standard has reserved the following multicast address for mac control frames: 0x01-80-c2-00-00-01.

field | bit(s) | initial value | description
fcah | 15:0 | x | flow control address high.
reserved | 31:16 | 0x0 | reserved. reads as 0x0.
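
the relationship between the on-the-wire address and the two register values can be verified with a short calculation. the helper name `flow_control_address_regs` is hypothetical; the byte ordering is inferred from the stated values (0x00_c2_80_01 and 0x01_00) and the statement that 01 is the first byte on the wire.

```python
def flow_control_address_regs(mac: bytes) -> tuple:
    """return (fcal, fcah) for a 6-byte address given in wire order.

    the first byte on the wire lands in the least significant byte of
    fcal; the last two bytes land in the low 16 bits of fcah.
    """
    if len(mac) != 6:
        raise ValueError("an ethernet address is 6 bytes")
    fcal = int.from_bytes(mac[0:4], "little")
    fcah = int.from_bytes(mac[4:6], "little")
    return fcal, fcah


# the 802.3x multicast address 01-80-c2-00-00-01
fcal, fcah = flow_control_address_regs(bytes.fromhex("0180c2000001"))
```

here `fcal` is 0x00c28001 and `fcah` is 0x0100, matching the values the two register descriptions say should be programmed.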
10.2.2.10 flow control type - fct (0x00030; rw)

this register contains the type field hardware uses to recognize a flow control packet. only the lower 16 bits of this register have meaning. this register should be programmed with 0x88_08. the upper byte (fct[15:8]) is first on the wire.

note: at the time of the original implementation, the flow control type field was not defined and thus hardware provided programmability. since then, the final release of the 802.3x standard has specified the type/length value for mac control frames as 88-08.

field | bit(s) | initial value | description
fct | 15:0 | x | flow control type.
reserved | 31:16 | 0x0 | reserved. reads as 0x0.

10.2.2.11 vlan ether type - vet (0x00038; rw)

this register contains the type field hardware uses to recognize an 802.1q (vlan) ethernet packet. to be compliant with the 802.3ac standard, this register should be programmed with the value 0x8100. for vlan transmission the upper byte (vet[15:8]) is first on the wire.

field | bit(s) | initial value | description
vet | 15:0 | 0x8100 | vlan ether type.
reserved | 31:16 | 0x0 | reserved. reads as 0x0.

10.2.2.12 flow control transmit timer value - fcttv (0x00170; rw)

the 16-bit value in the ttv field is inserted into a transmitted frame (either xoff frames or any pause frame value in any software transmitted packets). it counts in units of slot time. if software needs to send an xon frame, it must set ttv to zero prior to initiating the pause frame.

note: the 82574 uses a fixed slot time value of 64-byte times.

field | bit(s) | initial value | description
ttv | 15:0 | x | transmit timer value. included in xoff frame.
reserved | 31:16 | 0x0 | reads as 0x0. should be written to 0x0 for future compatibility.
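
since ttv counts slot times of 64 byte times, the pause duration it requests depends on link speed. the sketch below converts a ttv value to seconds; the function name is hypothetical, and the conversion assumes one byte time is 8 bit times at the nominal link rate.

```python
SLOT_TIME_BYTES = 64  # fixed slot time used by the 82574, per the note


def pause_duration_seconds(ttv: int, link_mbps: int) -> float:
    """duration requested by a pause frame carrying the given ttv."""
    if not 0 <= ttv <= 0xFFFF:
        raise ValueError("ttv is a 16-bit field")
    bit_times = ttv * SLOT_TIME_BYTES * 8
    return bit_times / (link_mbps * 1_000_000)
```

at 1000 mb/s the maximum ttv of 0xffff corresponds to roughly 33.6 ms, and a ttv of zero (an xon frame) corresponds to no pause at all.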
10.2.2.13 flow control refresh threshold value - fcrtv (0x05f40; rw)

bit | type | reset | description
15:0 | rw | x | flow control refresh threshold (fcrt). this value indicates the threshold value of the flow control shadow counter. when the counter reaches this value, and the conditions for a pause state are still valid (buffer fullness above low threshold value), a pause (xoff) frame is sent to the link partner. the fcrtv timer count interval is the same as other flow control timers and counts at slot times of 64-byte times. if this field contains a zero value, the flow control refresh is disabled.
31:16 | ro | 0x0 | reserved. reads as 0x0. should be written to 0x0 for future compatibility.

10.2.2.14 led control - ledctl (0x00e00; rw)

field | bit(s) | initial value | description
led0_mode | 3:0 | 0010b [1] | led0 (link_up_n) mode. this field specifies the control source for the led0 output. an initial value of 0010b selects link_up indication.
reserved | 4 | 0b | reserved. read-only as 0b. write as 0b for future compatibility.
global_blink_mode | 5 | 0b [1] | global blink mode. this field specifies the blink mode of all leds. 0b = blink at 200 ms on and 200 ms off. 1b = blink at 83 ms on and 83 ms off.
led0_ivrt | 6 | 0b [1] | led0 (link_up_n) invert. this field specifies the polarity/inversion of the led source prior to output or blink control. 0b = do not invert led source. 1b = invert led source.
led0_blink | 7 | 0b [1] | led0 (link_up_n) blink. this field specifies whether to apply blink logic to the (inverted) led control source prior to the led output. 0b = do not blink asserted led output. 1b = blink asserted led output.
led1_mode | 11:8 | 0011b [1] | led1 (activity_n) mode. this field specifies the control source for the led1 output. an initial value of 0011b selects activity indication.
reserved | 12 | 0b | reserved. read-only as 0b. write as 0b for future compatibility.
led1_blink_mode | 13 | 0b [1] | led1 (activity_n) blink mode. this field needs to be configured with the same value as global_blink_mode; it specifies the blink mode of the led. 0b = blink at 200 ms on and 200 ms off. 1b = blink at 83 ms on and 83 ms off.
led1_ivrt | 14 | 0b [1] | led1 (activity_n) invert.
led1_blink | 15 | 1b [1] | led1 (activity_n) blink.
led2_mode | 19:16 | 0110b [1] | led2 (link_100_n) mode. this field specifies the control source for the led2 output. an initial value of 0110b selects link_100 indication.
reserved | 20 | 0b | reserved. read-only as 0b. write as 0b for future compatibility.
led2_blink_mode | 21 | 0b [1] | led2 (link_100_n) blink mode. this field needs to be configured with the same value as global_blink_mode; it specifies the blink mode of the led. 0b = blink at 200 ms on and 200 ms off. 1b = blink at 83 ms on and 83 ms off.
led2_ivrt | 22 | 0b [1] | led2 (link_100_n) invert.
led2_blink | 23 | 0b [1] | led2 (link_100_n) blink.
reserved | 31:24 | 0x0 | reserved.
[1] these bits are read from the nvm.

the following mapping is used to specify the led control source (mode) for each led output:

mode | selected mode | source indication
0000 | link_10/1000 | asserted when either 10 or 1000 mb/s link is established and maintained.
0001 | link_100/1000 | asserted when either 100 or 1000 mb/s link is established and maintained.
0010 | link_up | asserted when any speed link is established and maintained.
0011 | filter_activity | asserted when link is established and packets are being transmitted or received that passed mac filtering.
0100 | link/activity | asserted when link is established and when there is no transmit or receive activity.
0101 | link_10 | asserted when a 10 mb/s link is established and maintained.
0110 | link_100 | asserted when a 100 mb/s link is established and maintained.
0111 | link_1000 | asserted when a 1000 mb/s link is established and maintained.
1000 | reserved | reserved.
1001 | full_duplex | asserted when the link is configured for full-duplex operation.
1010 | collision | asserted when a collision is observed.
1011 | activity | asserted when link is established and packets are being transmitted or received.
1100 | bus_size | asserted when the device detects a 1-lane pcie connection.
1101 | paused | asserted when the device's transmitter is flow controlled.
1110 | led_on | always asserted.
1111 | led_off | always de-asserted.
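
the ledctl layout repeats the same 8-bit pattern for each led (mode in bits 3:0, blink mode in bit 5, invert in bit 6, blink in bit 7), so a register value can be assembled per led. this is a minimal sketch with hypothetical helper names; the mode codes are copied from the mapping table, and bit 5 of the led0 byte is the global_blink_mode field.

```python
# mode codes from the led control source mapping table
LED_MODES = {
    "link_10/1000": 0x0, "link_100/1000": 0x1, "link_up": 0x2,
    "filter_activity": 0x3, "link/activity": 0x4, "link_10": 0x5,
    "link_100": 0x6, "link_1000": 0x7, "full_duplex": 0x9,
    "collision": 0xA, "activity": 0xB, "bus_size": 0xC,
    "paused": 0xD, "led_on": 0xE, "led_off": 0xF,
}


def led_field(mode: str, blink_mode: int = 0,
              invert: bool = False, blink: bool = False) -> int:
    """build the 8-bit control field for one led."""
    if blink_mode not in (0, 1):
        raise ValueError("blink_mode is a single bit")
    return (LED_MODES[mode] | (blink_mode << 5)
            | (int(invert) << 6) | (int(blink) << 7))


def ledctl_word(led0: int, led1: int, led2: int) -> int:
    """combine three per-led fields into a ledctl value (bits 31:24 stay 0)."""
    return led0 | (led1 << 8) | (led2 << 16)


# the documented initial values: led0 = link_up, led1 = filter_activity
# with blink set, led2 = link_100
default_ledctl = ledctl_word(
    led_field("link_up"),
    led_field("filter_activity", blink=True),
    led_field("link_100"),
)
```

`default_ledctl` evaluates to 0x00068302, which agrees with the initial values listed in the ledctl table.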
notes:
1. when led blink mode is enabled, the appropriate led invert bit should be set to zero.
2. the dynamic led modes (filter_activity, link/activity, collision, activity, paused) should be used with led blink mode enabled.
3. when led blink mode is enabled and ccm pll is shut, the blinking frequencies are 1/5 of the rates stated in the previous table.

10.2.2.15 extended configuration control - extcnf_ctrl (0x00f00; rw)

field | bit(s) | initial value | description
reserved | 31:28 | 0b | reserved.
reserved | 27:16 | 0x0 | reserved.
reserved | 15:8 | 0x0 | reserved.
reserved | 7 | 0b | reserved.
reserved | 6 | 0b | reserved.
reserved | 5 | 0b | reserved.
reserved | 4 | 0b | reserved.
reserved | 3 | 1b | reserved.
reserved | 2 | 0b | reserved.
reserved | 1 | 0b | reserved.
reserved | 0 | 0b | should be set to 0b.

10.2.2.16 extended configuration size - extcnf_size (0x00f08; rw)

field | bit(s) | initial value | description
reserved | 31:8 | 0x0 | reserved.
reserved | 7:0 | 0x0 | reserved.
10.2.2.17 packet buffer allocation - pba (0x01000; rw)

this register sets the on-chip receive and transmit storage allocation ratio. the receive allocation value is read/write for the lower 6 bits. the transmit allocation is read only and is calculated based on rxa. the partitioning size is 1 kb.

note: programming this register does not automatically re-load or initialize internal packet-buffer ram pointers. software must reset both transmit and receive operation (using the global device reset ctrl.rst bit) after changing this register in order for it to take effect. the pba register itself is not reset by asserting the global reset, but is only reset upon initial hardware power on.

note: for best performance the transmit buffer allocation should be set to accept two full sized packets.

note: transmit packet buffer size should be configured to be more than 4 kb.

field | bit(s) | initial value | description
rxa | 15:0 | 0x0014 | receive packet buffer allocation in kb. upper 10 bits are read only as 0x0. default is 20 kb.
txa | 31:16 | 0x0014 | transmit packet buffer allocation in kb. these bits are read only. default is 20 kb.

10.2.2.18 mng eeprom control register - eemngctl (0x1010; ro)

note: this register is read/write by firmware and read only by software.

field | bit(s) | initial value | description
addr | 14:0 | 0x0 | address. this field is written by manageability along with start read or start write to indicate the eeprom word address to read or write.
start | 15 | 0b | start. writing a 1b to this bit causes the eeprom to start the read or write operation according to the write bit.
write | 16 | 0b | write. this bit tells the eeprom if the current operation is read or write. 0b = read. 1b = write.
eebusy | 17 | 0b | eeprom busy. this bit indicates that the eeprom is busy doing an auto read.
reserved | 18 | 0b | reserved.
ee_trans_e | 19 | 0b | transaction. this bit indicates that the register is in the middle of a transaction.
reserved | 30:20 | 0x0 | reserved.
done | 31 | 1b | transaction done. this bit is cleared after the start write or the start read bit is set by manageability and is set back again when the eeprom write or read transaction completes.
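
returning to the packet buffer allocation (pba) register in section 10.2.2.17: the split between rxa and txa can be illustrated with a short sketch. it assumes, and this is an assumption rather than something the register description states, that the transmit allocation is the remainder of a 40 kb packet buffer (the sum of the two 20 kb defaults). the function name is hypothetical.

```python
# assumption: total on-chip packet buffer, taken as the sum of the
# default rxa (20 kb) and txa (20 kb) values
TOTAL_PACKET_BUFFER_KB = 40


def expected_pba(rxa_kb: int) -> int:
    """value the pba register would be expected to read back for a given rxa.

    only the lower 6 bits of rxa are writable; txa (bits 31:16) is
    read only and is modeled here as total minus rxa.
    """
    if not 0 <= rxa_kb <= 0x3F:
        raise ValueError("rxa is writable in its lower 6 bits only")
    txa_kb = TOTAL_PACKET_BUFFER_KB - rxa_kb
    if txa_kb <= 4:
        # per the note, the transmit buffer should be more than 4 kb
        raise ValueError("transmit allocation must exceed 4 kb")
    return (txa_kb << 16) | rxa_kb
```

under that assumption, `expected_pba(20)` reproduces the documented default of 0x0014 in both halves. as the first note in section 10.2.2.17 says, a new allocation only takes effect after a global device reset.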
10.2.2.19 MNG EEPROM Read/Write Data - EEMNGDATA (0x1014; RO)

Note: This register is read/write by firmware and read only by software.

Field | Bit(s) | Initial Value | Description
WRDATA | 15:0 | 0x0 | Write Data. Data to be written to the EEPROM.
RDDATA | 31:16 | X | Read Data. Data returned from the EEPROM read.

10.2.2.20 MNG Flash Control Register - FLMNGCTL (0x1018; RO)

Note: This register is read/write by firmware and read only by software.

10.2.2.21 MNG Flash Read Data - FLMNGDATA (0x101C; RO)

Note: This register is read/write by firmware and read only by software.

10.2.2.22 MNG Flash Read Counter - FLMNGCNT (0x1020; RO)

Note: This register is read/write by firmware and read only by software.

10.2.2.23 Flash Timer Register - FLASHT (0x01028; RW)

Field | Bit(s) | Default | Description
FLT | 15:0 | 0x2 | Auto Flash Update Timer. Defines the idle time from the last write until the 82574 autonomously updates the flash. The time is measured in FLASHT.FLT x 1024 cycles at 62.5 MHz (or 12.5 MHz when the 125 MHz clock is gated). A value of 0x00 means that the update is not delayed. The update timer is enabled by the AUPDEN bit in the EEC register.
Reserved | 31:16 | 0x00 | Reserved
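The FLASHT.FLT timing rule (FLT x 1024 cycles at 62.5 MHz, or at 12.5 MHz when the 125 MHz clock is gated) is easy to turn into wall-clock time. The sketch below only does that arithmetic; the function name is hypothetical and the caller supplies whichever clock applies.

```c
#include <assert.h>
#include <stdint.h>

/* Idle time, in nanoseconds, before the autonomous flash update.
 * flt      : value of FLASHT.FLT (bits 15:0)
 * clock_hz : 62500000 normally, 12500000 when the 125 MHz clock is gated */
static uint64_t flasht_delay_ns(uint16_t flt, uint32_t clock_hz)
{
    assert(clock_hz != 0);
    uint64_t cycles = (uint64_t)flt * 1024u; /* timer unit is 1024 clock cycles */
    return cycles * 1000000000ull / clock_hz;
}
```

With the default FLT of 0x2 this gives 32768 ns at 62.5 MHz and 163840 ns at 12.5 MHz.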
10.2.2.24 EEPROM Write Register - EEWR (0x0102C; RW)

Note: EEWR has direct access regardless of a valid signature in the NVM.

Field | Bit(s) | Default | Description
START | 0 | 0b | Start Write. Writing a 1b to this bit causes the 82574 to write a 16-bit word at the address stored in the ADDR field in the external NVM. The data is fetched from the DATA field. This bit is self-clearing.
DONE | 1 | 1b | Write Done. Set to 1b when the write completes. Set to 0b when the write is in progress. Writes by software are ignored.
ADDR | 15:2 | 0x0 | Write Address. This field is written by software along with Start Write to indicate the word address of the word to write.
DATA | 31:16 | 0x0 | Write Data. Data written to the NVM.

10.2.2.25 SW Flash Burst Control Register - FLSWCTL (0x1030; RW)

Field | Bit(s) | Default | Description
ADDR | 23:0 | 0x0 | Address. This field is written by software along with Start Read or Start Write to indicate the flash address to read or write.
CMD | 25:24 | 00b | Command. Indicates which command should be executed. Valid only when the CMDV bit is set. 00b = Reserved. 01b = DMA write command (write up to 256 bytes). 10b = Reserved. 11b = Reserved.
CMDV | 26 | 0b | Command Valid. When set, indicates that software issues a new command. Cleared by hardware at the end of the command.
FLBUSY | 27 | 0b | Flash Busy. This bit indicates that the flash is busy processing a flash transaction and should not be accessed.
Reserved | 28 | 0b | Reserved
FLUDONE | 29 | 0b | Flash Update Done. This bit is set by the 82574 when it completes updating the flash. Software should clear it to zero before it updates the flash.
DONE | 30 | 1b | Write Done. This bit clears after CMDV is set by software and is set back again when the flash write transaction is done. When writing a burst transaction the bit is cleared every time software writes FLSWDATA.
WRDONE | 31 | 1b | Global Done. This bit clears after the CMDV bit is set by software and is set back again when all flash read/write transactions complete, that is, when the flash unit has finished all the requested reads/writes.
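The EEWR layout in section 10.2.2.24 (START in bit 0, DONE in bit 1, ADDR in bits 15:2, DATA in bits 31:16) maps directly onto a small encoding helper. This is an illustrative sketch only; the macro and function names are hypothetical, and the actual register access and polling loop are left to the driver.

```c
#include <assert.h>
#include <stdint.h>

#define EEWR_START      (1u << 0)  /* self-clearing start bit */
#define EEWR_DONE       (1u << 1)  /* 1b when the write has completed */
#define EEWR_ADDR_SHIFT 2
#define EEWR_ADDR_MASK  0x3FFFu    /* 14-bit word address, bits 15:2 */
#define EEWR_DATA_SHIFT 16

/* Build the 32-bit value that starts a one-word NVM write. */
static uint32_t eewr_encode(uint16_t word_addr, uint16_t data)
{
    assert(word_addr <= EEWR_ADDR_MASK);
    return ((uint32_t)data << EEWR_DATA_SHIFT) |
           ((uint32_t)word_addr << EEWR_ADDR_SHIFT) |
           EEWR_START;
}

/* Returns nonzero once a value read back from EEWR shows DONE set. */
static int eewr_is_done(uint32_t reg_value)
{
    return (reg_value & EEWR_DONE) != 0;
}
```

A driver would write the encoded value to offset 0x0102C and then poll the register until `eewr_is_done()` reports completion, ideally with a timeout.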
82574 gbe controller?driver programing interface 300 10.2.2.26 software flash burst data register - flswdata (0x1034; rw) 10.2.2.27 software flash burst access counter - flswcnt (0x1038; rw) 10.2.2.28 flash opcode register - flop (0x0103c; rw) this register is used by the 82574 to initiate the appropriate instructions to the nvm device. 10.2.2.29 feep auto load - flol (0x01050; rw) 10.2.3 pcie register descriptions 10.2.3.1 3gio control register - gcr (0x05b00; rw) field bit(s) default description nvdata 31:0 0x0 write nvm data data written to the nvm. field bit(s) default description abort 31 0b abort writing a 1b to this bit aborts the current burst operation. it is self- cleared by the flash interface block when the abort command has been executed. abort request is not permitted after writing the last dword. reserved 30:25 0x0 reserved nvcnt 24:0 0x0 nvm counter this counter holds the size of the flash burst read or write in dwords and is also used as the write byte count but in this case it is byte count. field bit(s) default description ram_pwr_ save_en 01b when set to 1b, enables reduced power consumption by clock gating the 82574 rams. reserved 7:1 0x0 auto loaded from nvm 0x11 bits 7:1. reserve 31:8 0x0 reserved field bit(s) initial value description disable_ timeout_ mechanism 31 0b if set, the pcie time- out mechanism is disabled. self_test_ result 30 0b if set, a self-test result finished successfully. gio_good_l0s 29 0b force good pcie l0s training. gio_dis_rd_ err 28 0b disable running disparity error of pcie 108b decoders.
301 driver programing interface?82574 gbe controller l1_act_ without_l0s_ rx 27 0b if set, enables the device to enter aspm l1 active without any correlation to l0s_rx. l1_entry_ latency (lsb) (read only) 26:25 11b determines the idle time of the pcie link in l0s state before initiating a transition to l1 state. the initial value is loaded from nvm. 00b = 64 ? s 01b = 256 ? s 10b = 1 ms 11b = 4 ms l0s_entry_ lat 24 0b l0s entry latency set to 0b to indicate l0s entry latency is the same as l0s exit latency. set to 1b to indicate l0s entry latency is (l0s exit latency/4). l1_entry_ latency (msb) (read only) 23 1b latency 000b = 2 ? s. 001b = 8 ? s. 010b = 1 6 ? s. 011b = 32 ? s. 100b = 64 ? s. 101b = 25 6 ? s. 110b = 1 ms. 111b = 4 ms (default). reserved 22 0b reserved for proper operation, must be set to 1b by software during initialization. header_log_ order 21 0b when set, indicates a need to chan ge the order of the header log in the error reporting registers. pba_cl_deas 20 0b if cleared, pba is cleared on de-assertion of msi-x request. reserved 19:10 0x0 reserved rx_l0s_ adjustment 91b when set to 1b the reply-timer always adds the required l0s adjustment. when cleared to 0b the adjustment is added only when tx l0s is active. reserved 8:6 0b reserved txdscr_ nosnoop 50b transmit descriptor read ? no snoop indication. read directly by transaction layer. txdscw_ nosnoop 40b transmit descriptor write ? no snoop indication. read directly by transaction layer. txd_ nosnoop 30b transmit data read ? no snoop indication. read directly by transaction layer. rxdscr_ nosnoop 20 receive descriptor read ? no snoop indication. read directly by transaction layer. rxdscw_ nosnoop 10b receive descriptor write ? no snoop indication read directly by transaction layer. rxd_ nosnoop 00b receive data write ? no snoop indication read directly by transaction layer. field bit(s) initial value description
10.2.3.2 Function-Tag Register - FUNCTAG (0x05B08; RW)

Field | Bit(s) | Initial Value | Description
CNT_3_TAG | 31:29 | 0x0 | Tag number for event 6/1D, if located in counter 3.
CNT_3_FUNC | 28:24 | 0x0 | Function number for event 6/1D, if located in counter 3.
CNT_2_TAG | 23:21 | 0x0 | Tag number for event 6/1D, if located in counter 2.
CNT_2_FUNC | 20:16 | 0x0 | Function number for event 6/1D, if located in counter 2.
CNT_1_TAG | 15:13 | 0x0 | Tag number for event 6/1D, if located in counter 1.
CNT_1_FUNC | 12:8 | 0x0 | Function number for event 6/1D, if located in counter 1.
CNT_0_TAG | 7:5 | 0x0 | Tag number for event 6/1D, if located in counter 0.
CNT_0_FUNC | 4:0 | 0x0 | Function number for event 6/1D, if located in counter 0.

10.2.3.3 3GIO Statistic Control Register #1 - GSCL_1 (0x05B10; RW)

Field | Bit(s) | Initial Value | Description
GIO_COUNT_START | 31 | 0b | Start indication of 3GIO statistic counters.
GIO_COUNT_STOP | 30 | 0b | Stop indication of 3GIO statistic counters.
GIO_COUNT_RESET | 29 | 0b | Reset indication of 3GIO statistic counters.
GIO_64_BIT_EN | 28 | 0b | Enable two 64-bit counters instead of four 32-bit counters.
GIO_COUNT_TEST | 27 | 0b | Test bit. Forward counters for testability.
Reserved | 26:4 | 0x0 | Reserved
GIO_COUNT_EN_3 | 3 | 0b | Enable 3GIO statistic counter number 3.
GIO_COUNT_EN_2 | 2 | 0b | Enable 3GIO statistic counter number 2.
GIO_COUNT_EN_1 | 1 | 0b | Enable 3GIO statistic counter number 1.
GIO_COUNT_EN_0 | 0 | 0b | Enable 3GIO statistic counter number 0.
10.2.3.4 3GIO Statistic Control Register #2 - GSCL_2 (0x05B14; RW)

This counter contains the mapping of the event (which counter counts what event).

Field | Bit(s) | Initial Value | Description
GIO_EVENT_NUM_3 | 31:24 | 0x0 | The event number that counter 3 counts.
GIO_EVENT_NUM_2 | 23:16 | 0x0 | The event number that counter 2 counts.
GIO_EVENT_NUM_1 | 15:8 | 0x0 | The event number that counter 1 counts.
GIO_EVENT_NUM_0 | 7:0 | 0x0 | The event number that counter 0 counts.

10.2.3.5 3GIO Statistic Control Register #3 - GSCL_3 (0x05B18; RW)

This counter holds the threshold values needed for some of the event counting. Note that the event increases only after the value passes the threshold boundary.

Field | Bit(s) | Initial Value | Description
GIO_FC_TH_0 | 11:0 | 0x0 | Threshold of flow control credits. Optional values: 0 = (256-1).
Reserved | 15:12 | 0x0 | Reserved
GIO_FC_TH_1 | 27:16 | 0x0 | Threshold of flow control credits. Optional values: 0 = (256-1).
Reserved | 31:28 | 0x0 | Reserved

10.2.3.6 3GIO Statistic Control Register #4 - GSCL_4 (0x05B1C; RW)

This counter holds the threshold values needed for some of the event counting. Note that the event increases only after the value passes the threshold boundary.

Field | Bit(s) | Initial Value | Description
Reserved | 31:16 | 0x0 | Reserved
GIO_RB_TH | 15:10 | 0x0 | Retry buffer threshold.
HOST_COML_TH | 9:0 | 0x0 | Completions latency threshold.

10.2.3.7 3GIO Statistic Counter Registers #0 - GSCN_0 (0x05B20; RW)
10.2.3.8 3GIO Statistic Counter Registers #1 - GSCN_1 (0x05B24; RW)
10.2.3.9 3GIO Statistic Counter Registers #2 - GSCN_2 (0x05B28; RW)
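The GSCL_1 and GSCL_2 layouts (sections 10.2.3.3 and 10.2.3.4) suggest a simple way to bind one of the four statistic counters to an event number. The following sketch only computes register values; the helper names are hypothetical, and setting the enable bit and GIO_COUNT_START in the same write is an assumption, since the datasheet does not spell out the required sequence.

```c
#include <assert.h>
#include <stdint.h>

#define GSCL1_COUNT_START (1u << 31) /* GIO_COUNT_START */

/* Return gscl2 with the 8-bit event number for `counter` (0..3) replaced. */
static uint32_t gscl2_map_event(uint32_t gscl2, unsigned counter, uint8_t event)
{
    assert(counter < 4);
    unsigned shift = counter * 8u;   /* GIO_EVENT_NUM_n occupies one byte each */
    gscl2 &= ~(0xFFu << shift);
    return gscl2 | ((uint32_t)event << shift);
}

/* GSCL_1 value that enables one counter and requests a start
 * (assumed to be acceptable in a single write). */
static uint32_t gscl1_enable_and_start(unsigned counter)
{
    assert(counter < 4);
    return (1u << counter) | GSCL1_COUNT_START;
}
```

For example, mapping counter 3 to event 0x20 (the clock counter in the event mapping table) yields a GSCL_2 value of 0x20000000 when starting from zero.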
10.2.3.10 3GIO Statistic Counter Registers #3 - GSCN_3 (0x05B2C; RW)

10.2.3.11 Software Semaphore Register - SWSM (0x05B50; RW)

Field | Bit(s) | Initial Value | Description
Reserved | 0 | 1b | Reserved
SWESMBI | 1 | 0b | Software EEPROM Semaphore Bit. This bit should be set only by the software device driver (read only to firmware). The software device driver should set this bit and then read it to see if it was set. If it was set, it means that the software device driver can read/write from/to the EEPROM. The software device driver should clear this bit when finishing its EEPROM access. Hardware clears this bit on GIO soft reset.
Reserved | 2 | 0b | Reserved
Reserved | 3 | 0b | Reserved
Reserved | 31:4 | 0x0 | Reserved

10.2.3.12 3GPIO Control Register 2 - GCR2 (0x05B64; RW)

Field | Bit(s) | Initial Value | Description
Reserved | 31:1 | 0x0 | Reserved
Reserved | 0 | 0b | Reserved. Must be set to 1b by software during initialization.

10.2.3.13 MSI-X PBA Clear - PBACLR (0x5B68; RW1C)

Field | Bit(s) | Initial Value | Description
PENBIT | 4:0 | 0x0 | MSI-X Pending Bit Clear. Writing a 1b to any bit clears the corresponding MSI-X PBA bit; writing 0b has no effect.
Reserved | 31:5 | 0x0 | Reserved
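The SWESMBI handshake in section 10.2.3.11 (set the bit, read it back, proceed only if it reads as set, clear it when done) can be sketched as follows. This is a minimal illustration, assuming the caller passes a pointer to the memory-mapped SWSM register; the function names and the retry count are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SWSM_SWESMBI (1u << 1) /* software EEPROM semaphore bit */

/* Try to take the EEPROM semaphore. Returns 0 on success, -1 on timeout. */
static int swsm_acquire(volatile uint32_t *swsm, unsigned max_tries)
{
    assert(swsm != NULL);
    for (unsigned i = 0; i < max_tries; i++) {
        *swsm |= SWSM_SWESMBI;        /* request ownership */
        if (*swsm & SWSM_SWESMBI)     /* read back: set means we own it */
            return 0;
    }
    return -1;
}

/* Release the semaphore after the EEPROM access is finished. */
static void swsm_release(volatile uint32_t *swsm)
{
    assert(swsm != NULL);
    *swsm &= ~SWSM_SWESMBI;
}
```

A real driver would normally add a short delay between retries; that is omitted here to keep the sketch free of platform-specific calls.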
305 driver programing interface?82574 gbe controller 10.2.3.14 statistic event mapping transaction layer events event mapping (hex) description dwords of transaction layer packet (tlp) transmitted (transferred to the physical layer), include payload and header. 0 each 125 mhz cycle the counter increases by 1 (1 dword) or 2 (2 dwords). counted: completion, memory, message (not replied). all types of transmitted packets. 1 only tlp packets. each cycle, the counter increase by 1 if tlp packet was transmitted to the link. counted: completion, memory, message (not replied). transmit tlp packets of function #0 2 each cycle, the counter increases by 1, if the packet was transmitted. counted: memory, message of function 0 (not replied). transmit tlp packets of function #1 3 each cycle, the counter increases by 1, if the packet was transmitted. counted: memory, message of function 1 (not replied). non posted transmit tlp packets of function #0 4 each cycle, the counter increases by 1, if the packet was transmitted. counted: memory (np) of function 0 (not replied). non posted transmit tlp packets of function #1 5 each cycle, the counter increases by 1, if the packet was transmitted. counted: memory (np) of function 1 (not replied). transmit tlp packets of function x and tag y, according to func_tag register 6 each cycle, the counter increases by 1, if the packet was transmitted. counted: memory, message for a given func# and tag# (not replied). all types of received packets (tlp only) 1a each cycle, the counter increases by 1, if the packet was received. counted: completion (only good), memory, i/o, config. receive tlp packets of function #0 1b each cycle, the counter increases by 1, if the packet was received. counted: good completions of func#0. reserved 1c reserved receive completion packets 1d each cycle, the counter increases by 1, if the packet was received. counted: good completions for a given func# and tag#. clock counter 20 counts gio cycles. 
bad tlp from ll 21 each cycle, the counter increases by 1, if a bad tlp is received (bad crc, error reported by al, misplaced special char, reset in thi of received tlp) . header dwords of transaction layer packet transmitted. 25 only tlp, each 125 mhz cycle the counter increases by 1 (1 dword of header) or 2 (2 dwords of the header). counted: completion, memory, message (not replied). header dwords of transaction layer packet received. 26 only tlp, each 125 mhz cycle the counter increases by 1 (1 dword of header) or 2 (2 dwords of the header). counted: completion, memory, message.
82574 gbe controller?driver programing interface 306 transaction layer events event mapping (hex) description transaction layer stalls transmitting due to lack of flow control credits of the next part. 27 the counter counts the number of times the transaction layer stops transmitting because of this (per packet). counted: completion, memory, message. retransmitted packets. 28 the counter increases for each re-transmitted packet. counted: completion, memory, message. stall due to retry buffer full 29 the counter counts the number of times transaction layer stops transmitting because the retry buffer is full (per packet). counted: completion, memory, message. retry buffer is under threshold 2a threshold specified by software, retry buffer is under threshold per packet. counted: completion, memory, message. posted request header (prh) flow control credits (of the next part) below threshold 2b threshold specified by software. the counter increases each time the number of the specific flow control credits is lower than the threshold. counted: according to credit type. posted request data (prd) flow control credits (of the next part) below threshold 2c non-posted request header (nprh) flow control credits (of the next part) below threshold 2d completion header (cplh) flow control credits (of the next part) below threshold 2e completion data (cpld) flow control credits (of the next part) below threshold 2f posted request header (prh) flow control credits (of local part) get to zero. 30 threshold specified by software. the counter increases each time the number of the specific flow control credits reaches the value of zero. (the period that the credit is zero is not counted). counted: according to credit type. non-posted request header (nprh) flow control credits (of local part) get to zero. 31 posted request data (prd) flow control credits (of local part) get to zero. 32 non-posted request data (nprd) flow control credits (of local part) get to zero. 
33 dwords of tlp received, include payload and header. 34 each 125 mhz cycle the counter increases by 1 (1 dword) or 2 (2 dwords). counted: completion, memory, message, i/o, config. messages packets received 35 each 125 mhz cycle the counter increases by 1. counted: messages (only good). received packets to func_logic. 36 each 125 mhz cycle the counter increases by 1. counted: memory, i/o, config (only good).
307 driver programing interface?82574 gbe controller host arbiter events event mapping description average latency of read request ? from initialization until end of completions. estimated latency is ~5 ? s 40 + 41 software selects the client that needs to be tested. the statistic counter counts the number of read requests of the required client. in addition, the accumulated time of all requests are saved in a time accumulator. the average time for read request is: [accumulated time/number of read requests]. (event 41 is for the counter). average latency of read request rtt? from initialization until the first completion is arrived (round trip time). estimated latency is 1 ? s 42 + 43 software selects the client that needs to be tested. the statistic counter counts the number of read requests of the required client. in addition, the accumulated time of all rtt are saved in a time accumulator. the average time for read request is: [accumulated time/number of read requests]. (event 43 is for the counter). requests that reached time out. 44 number of requests that reached time out. completion latency above threshold 45 + 46 software selects the client that needs to be tested. software programs the required threshold (in gscl_4 ? units of 96 ns). one statistic counter counts the time from the beginning of the request until end of completions. the other counter counts the number of events. if the time is above threshold ? add 1 to the event counter. (event 46 is for the counter). completion latency above threshold ? for rtt 47 + 48 software selects the client that needs to be tested. software programs the required threshold (in gscl_4 ? units of 96 ns). one statistic counter counts the time from the beginning of the request until first completion arrival. the other counter counts the number of events. if the time is above threshold ? add 1 to the event counter. (event 48 is for the counter). 
link layer events event mapping description dwords of the packet transmitted (transferred to the physical layer), include payload and header. 50 include dllp (link layer packets) and tlp (transaction layer packets transmitted. each 125 mhz cycle the counter increases by 1 (1 dword) or 2 (2 dwords). dwords of packet received (transferred to the physical layer), include payload and header. 51 include dllp (link layer packets) and tlp (transaction layer packets transmitted. each 125 mhz cycle the counter increases by 1 (1 dword) or 2 (2 dwords). all types of dllp packets transmitted from link layer. 52 each cycle, the counter increases by 1, if dllp packet was transmitted. flow control dllp transmitted from link layer. 53 each cycle, the counter increases by 1, if message was transmitted ack dllp transmitted. 54 each cycle, the counter increases by 1, if message was transmitted. all types of dllp packets received. 55 each cycle, the counter increases by 1, if dllp was received.
Flow control DLLP received in link layer. | 56 | Each cycle, the counter increases by 1, if message was received.
ACK DLLP received. | 57 | Each cycle, the counter increases by 1, if message was received.
NACK DLLP received. | 58 | Each cycle, the counter increases by 1, if message was received.

10.2.4 Interrupt Register Descriptions

10.2.4.1 Interrupt Cause Read Register - ICR (0x000C0; RC/WC)

This register contains all interrupt conditions for the 82574. Whenever an interrupt causing event occurs, the corresponding interrupt bit is set in this register. A PCIe interrupt is generated whenever one of the bits in this register is set, and the corresponding interrupt is enabled via the Interrupt Mask Set/Read register. Whenever an interrupt causing event occurs, all timers of delayed interrupts are cleared and their cause event is set in the ICR.

Reading from the ICR register has different effects according to the following three cases:
• Case 1 - Interrupt mask register equals 0x0000 (mask all): ICR content is cleared.
• Case 2 - Interrupt was asserted (ICR.INT_ASSERT=1) and auto mask is active: ICR content is cleared, and the IAM register is written to the IMC register.
• Case 3 - Interrupt was not asserted (ICR.INT_ASSERT=0): Read has no side effect.

Writing a 1b to any bit in the register also clears that bit. Writing a 0b to any bit has no effect on that bit.

Note: The INT_ASSERTED bit is a special case. Writing a 1b or 0b to this bit has no effect. It is cleared only when all interrupt sources are cleared.

Field | Bit(s) | Initial Value | Description
TXDW | 0 | 0b | Transmit Descriptor Written Back. Set when hardware processes a descriptor with RS set. If using delayed interrupts (IDE set), the interrupt is delayed until after one of the delayed timers (TIDV or TADV) expires.
TXQE | 1 | 0b | Transmit Queue Empty. Set when the last descriptor block for a transmit queue has been used. When configured to use more than one transmit queue this interrupt indication is issued if one of the queues is empty and is not cleared until all the queues have valid descriptors.
LSC | 2 | 0b | Link Status Change. This bit is set whenever the link status changes (either from up to down, or from down to up). This bit is affected by the link indication from the PHY.
Reserved | 3 | 0b | Reserved
RXDMT0 | 4 | 0b | Receive Descriptor Minimum Threshold Hit. This bit indicates that the number of receive descriptors has reached the minimum threshold as set in RCTL.RDMTS. This indicates to the software to load more receive descriptors.
Reserved | 5 | 0b | Reserved
RXO | 6 | 0b | Receiver Overrun. Set on receive data FIFO overrun. Could be caused either because there are no available buffers or because PCIe receive bandwidth is inadequate.
RXT0 | 7 | 0b | Receiver Timer Interrupt. Set when the timer expires.
Reserved | 8 | 0b | Reserved
MDAC | 9 | 0b | MDIO Access Complete. Set when MDIO access completes. See section 10.2.7.36 for details.
Reserved | 14:10 | 0x0 | Reserved
TXD_LOW | 15 | 0b | Transmit Descriptor Low Threshold Hit. Indicates that the number of descriptors in the transmit descriptor ring has reached the level specified in the Transmit Descriptor Control register (TXDCTL.LWTHRESH).
SRPD | 16 | 0b | Small Receive Packet Detected. Indicates that a packet of size < RSRPD.SIZE has been detected and transferred to host memory. The interrupt is only asserted if the RSRPD.SIZE register has a non-zero value.
ACK | 17 | 0b | Receive ACK Frame Detected. Indicates that an ACK frame has been received and the timer in RAID.ACK_DELAY has expired.
MNG | 18 | 0b | Manageability Event Detected. Indicates that a manageability event happened. When the device is in power down mode, PME might be generated for the same events that would cause an interrupt when the device is in the D0 state.
Reserved | 19 | 0b | Reserved
RXQ0 | 20 | 0b | Receive Queue 0 Interrupt. Indicates receive queue 0 write back or receive queue 0 descriptor minimum threshold hit.
RXQ1 | 21 | 0b | Receive Queue 1 Interrupt. Indicates receive queue 1 write back or receive queue 1 descriptor minimum threshold hit.
TXQ0 | 22 | 0b | Transmit Queue 0 Interrupt. Indicates transmit queue 0 write back.
TXQ1 | 23 | 0b | Transmit Queue 1 Interrupt. Indicates transmit queue 1 write back.
Other | 24 | 0b | Other Interrupt. Indicates one of the following interrupts was set: link status change, receiver overrun, MDIO access complete, small receive packet detected, receive ACK frame detected, manageability event detected.
Reserved | 30:25 | 0x0 | Reserved. Reads as 0x0.
INT_ASSERTED | 31 | 0b | Interrupt Asserted. This bit is set when the LAN port has a pending interrupt. If the interrupt is enabled in the PCI configuration space, an interrupt is asserted.
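The ICR bit assignments and the write-1-to-clear rule of section 10.2.4.1 can be captured in a few masks. The sketch below is a software model for illustration, not hardware behavior in full: it treats INT_ASSERTED as unaffected by writes, as the note in that section states, and does not model when hardware itself clears that bit. The macro and function names are hypothetical.

```c
#include <stdint.h>

#define ICR_TXDW         (1u << 0)
#define ICR_LSC          (1u << 2)
#define ICR_RXO          (1u << 6)
#define ICR_RXT0         (1u << 7)
#define ICR_MDAC         (1u << 9)
#define ICR_SRPD         (1u << 16)
#define ICR_ACK          (1u << 17)
#define ICR_MNG          (1u << 18)
#define ICR_OTHER        (1u << 24)
#define ICR_INT_ASSERTED (1u << 31)

/* Causes that the datasheet lists under the "Other" interrupt (bit 24). */
#define ICR_OTHER_CAUSES \
    (ICR_LSC | ICR_RXO | ICR_MDAC | ICR_SRPD | ICR_ACK | ICR_MNG)

/* Nonzero if any cause that feeds the "Other" interrupt is set. */
static int icr_has_other_cause(uint32_t icr)
{
    return (icr & ICR_OTHER_CAUSES) != 0;
}

/* Model of a software write: 1b clears a cause bit, 0b leaves it alone,
 * and bit 31 ignores writes entirely. */
static uint32_t icr_model_write(uint32_t icr, uint32_t written)
{
    return icr & ~(written & ~ICR_INT_ASSERTED);
}
```

In MSI-X mode this kind of explicit write-to-clear is the path to use for bits without auto clear, because a read-to-clear could erase causes tied to other vectors.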
10.2.4.2 Interrupt Throttling Register - ITR (0x000C4; R/W)

Software can use this register to prevent the condition of repeated, closely spaced interrupts to the host CPU, asserted by the 82574, by guaranteeing a minimum delay between successive interrupts.

To independently validate configuration settings, software can use the following algorithm to convert the inter-interrupt interval value to the common interrupts/sec performance metric:

interrupts/sec = (256 x 10^-9 sec x interval)^-1

For example, if the interval is programmed to 500 (decimal), the 82574 guarantees the CPU is not interrupted by it for 128 µs from the last interrupt. The maximum observable interrupt rate from the 82574 should never exceed 7813 interrupts/sec.

Inversely, the inter-interrupt interval value can be calculated as:

inter-interrupt interval = (256 x 10^-9 sec x interrupts/sec)^-1

The optimal performance setting for this register is very system and configuration specific. An initial suggested range is 651-5580 decimal (or 0x28B - 0x15CC).

Field | Bit(s) | Initial Value | Description
INTERVAL | 15:0 | 0x0 | Minimum inter-interrupt interval. The interval is specified in 256 ns increments. Zero disables interrupt throttling logic.
Reserved | 31:16 | 0x0 | Reserved. Should be written with 0x0 to ensure future compatibility.

10.2.4.3 Extended Interrupt Throttle - EITR (0x000E8 + 4*n [n = 0..4]; R/W)

Each EITR is responsible for an MSI-X interrupt cause. The allocation of EITR-to-interrupt cause is through the IVAR registers.

Software can use this register to prevent the condition of repeated, closely spaced interrupts to the host CPU, asserted by the network controller, by guaranteeing a minimum delay between successive interrupts.

Field | Bit(s) | Initial Value | Description
INTERVAL | 15:0 | 0x0 | Minimum inter-interrupt interval. The interval is specified in 256 ns increments. Zero disables interrupt throttling logic.
Reserved | 31:16 | 0x0 | Reserved. Should be written with 0x0 to ensure future compatibility.
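The two ITR conversion formulas reduce to integer arithmetic, because 1 / (256 x 10^-9) is exactly 3,906,250. The helpers below are a sketch of that conversion with round-to-nearest; the names are hypothetical, and the same arithmetic applies to the EITR interval field since it uses the same 256 ns unit.

```c
#include <assert.h>
#include <stdint.h>

#define ITR_UNITS_PER_SEC 3906250u /* 1 s / 256 ns */

/* Maximum interrupts/sec for a given INTERVAL value (0 = throttling disabled). */
static uint32_t itr_rate_from_interval(uint16_t interval)
{
    if (interval == 0)
        return 0; /* no throttling, so no meaningful upper bound */
    return (ITR_UNITS_PER_SEC + interval / 2u) / interval;
}

/* INTERVAL value for a target interrupts/sec, clamped to the 16-bit field. */
static uint16_t itr_interval_from_rate(uint32_t rate)
{
    assert(rate > 0);
    uint32_t interval = (ITR_UNITS_PER_SEC + rate / 2u) / rate;
    return interval > 0xFFFFu ? 0xFFFFu : (uint16_t)interval;
}
```

An interval of 500 gives 7813 interrupts/sec, matching the worked example above, and the suggested range of 651 to 5580 corresponds to roughly 6000 down to 700 interrupts/sec.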
311 driver programing interface?82574 gbe controller 10.2.4.4 interrupt cause set register - ic s (0x000c8; w) software uses this register to set an interru pt condition. any bit written with a 1b sets the corresponding interrupt. this results in the corresponding bit being set in the interrupt cause read register (see section 10.2.4.1 ). a pcie interrupt is also generated if one of the bits in this regist er is set and the corresponding interrupt is enabled via the interrupt mask set/read register (see section 10.2.4.5 ). bits written with 0b are unchanged. field bit(s) initial value description txdw 0 x sets transmit descriptor written back txqe 1 x sets transmit queue empty lsc 2 x sets link status change. reserved 3 x reserved rxdmt0 4 x sets receive descriptor minimum threshold hit reserved 5 x reserved rxo 6 x sets receiver overrun set on receive data fifo overrun. rxt0 7 x sets receiver timer interrupt reserved 8 x reserved mdac 9 x sets mdio acce ss complete interrupt reserved 10 x reserved reserved 11 x reserved reserved 12 x reserved reserved 14:13 x reserved txd_low 15 x transmit descriptor low threshold hit srpd 16 x small receive packet detected and transferred ack 17 x sets receive ack frame detected mng 18 x sets manageability event reserved 19 x reserved rxq0 20 0 sets receive queue 0 interrupt rxq1 21 0 sets receive queue 1 interrupt txq0 22 0 sets transmit queue 0 interrupt txq1 23 0 sets transmit queue 1 interrupt other 24 0 sets other interrupt reserved 31:25 x reserved should be written with 0x0 to ensure future compatibility
10.2.4.5 Interrupt Mask Set/Read Register - IMS (0x000D0; RW)

Reading this register returns which bits have an interrupt mask set. An interrupt is enabled if its corresponding mask bit is set to 1b, and disabled if its corresponding mask bit is set to 0b. A PCIe interrupt is generated whenever one of the bits in this register is set, and the corresponding interrupt condition occurs. The occurrence of an interrupt condition is reflected by having a bit set in the Interrupt Cause Read register (see section 10.2.4.1).

A particular interrupt can be enabled by writing a 1b to the corresponding mask bit in this register. Any bits written with a 0b are unchanged. Thus, if software desires to disable a particular interrupt condition that had been previously enabled, it must write to the Interrupt Mask Clear register (see section 10.2.4.6), rather than writing a 0b to a bit in this register.

When the CTRL_EXT.INT_TIMERS_CLEAR_ENA bit is set, writing all 1b's to the IMS register (enable all interrupts) clears all interrupt timers to their initial value. This auto clear provides the required latency before the next INT event.

Field | Bit(s) | Initial Value | Description
TXDW | 0 | 0b | Sets the mask for transmit descriptor written back.
TXQE | 1 | 0b | Sets the mask for transmit queue empty.
LSC | 2 | 0b | Sets the mask for link status change.
Reserved | 3 | 0b | Reserved
RXDMT0 | 4 | 0b | Sets the mask for receive descriptor minimum threshold hit.
Reserved | 5 | 0b | Reserved
RXO | 6 | 0b | Sets the mask for receiver overrun. Set on receive data FIFO overrun.
RXT0 | 7 | 0b | Sets the mask for receiver timer interrupt.
Reserved | 8 | 0b | Reserved
MDAC | 9 | 0b | Sets the mask for MDIO access complete interrupt.
Reserved | 10 | 0b | Reserved
Reserved | 11 | 0b | Reserved
Reserved | 12 | 0b | Reserved
Reserved | 14:13 | 0x0 | Reserved
TXD_LOW | 15 | 0b | Sets the mask for transmit descriptor low threshold hit.
SRPD | 16 | 0b | Sets the mask for small receive packet detection.
ACK | 17 | 0b | Sets the mask for receive ACK frame detection.
MNG | 18 | X | Sets the mask for a manageability event.
Reserved | 19 | X | Reserved
RXQ0 | 20 | 0b | Sets the mask for receive queue 0 interrupt.
RXQ1 | 21 | 0b | Sets the mask for receive queue 1 interrupt.
TXQ0 | 22 | 0b | Sets the mask for transmit queue 0 interrupt.
TXQ1 | 23 | 0b | Sets the mask for transmit queue 1 interrupt.
Other | 24 | 0b | Sets the mask for other interrupt.
Reserved | 31:25 | 0x0 | Reserved. Should be written with 0x0 to ensure future compatibility.
10.2.4.6 Interrupt Mask Clear Register - IMC (0x000D8; W)

Software uses this register to disable an interrupt. Interrupts are presented to the bus interface only when the mask bit is 1b and the cause bit is 1b. The status of the mask bit is reflected in the Interrupt Mask Set/Read register (see section 10.2.4.5), and the status of the cause bit is reflected in the Interrupt Cause Read register (see section 10.2.4.1).

Software blocks interrupts by clearing the corresponding mask bit. This is accomplished by writing a 1b to the corresponding bit in this register. Bits written with 0b are unchanged (that is, their mask status does not change).

In summary, the sole purpose of this register is to give software a way to disable certain, or all, interrupts. Software disables a given interrupt by writing a 1b to the corresponding bit in this register.

Field | Bit(s) | Initial Value | Description
TXDW | 0 | 0b | Clears the mask for transmit descriptor written back.
TXQE | 1 | 0b | Clears the mask for transmit queue empty.
LSC | 2 | 0b | Clears the mask for link status change.
Reserved | 3 | 0b | Reserved
RXDMT0 | 4 | 0b | Clears the mask for receive descriptor minimum threshold hit.
Reserved | 5 | 0b | Reserved. Reads as 0b.
RXO | 6 | 0b | Clears the mask for receiver overrun. Set on receive data FIFO overrun.
RXT0 | 7 | 0b | Clears the mask for receiver timer interrupt.
Reserved | 8 | 0b | Reserved
MDAC | 9 | 0b | Clears the mask for MDIO access complete interrupt.
Reserved | 10 | 0b | Reserved
Reserved | 11 | 0b | Reserved. Reads as 0b.
Reserved | 12 | 0b | Reserved
Reserved | 14:13 | 00b | Reserved
TXD_LOW | 15 | 0b | Clears the mask for transmit descriptor low threshold hit.
SRPD | 16 | 0b | Clears the mask for small receive packet detect interrupt.
ACK | 17 | 0b | Clears the mask for receive ACK frame detect interrupt.
MNG | 18 | X | Clears the mask for a manageability event.
Reserved | 19 | X | Reserved
RXQ0 | 20 | 0b | Clears the mask for receive queue 0 interrupt.
RXQ1 | 21 | 0b | Clears the mask for receive queue 1 interrupt.
TXQ0 | 22 | 0b | Clears the mask for transmit queue 0 interrupt.
TXQ1 | 23 | 0b | Clears the mask for transmit queue 1 interrupt.
Other | 24 | 0b | Clears the mask for other interrupt.
Reserved | 31:25 | 0x0 | Reserved. Should be written with 0x0 to ensure future compatibility.
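The IMS and IMC registers form a set/clear pair over one underlying mask: writing 1b to IMS enables a cause, writing 1b to IMC disables it, and 0b bits are ignored by both. The following is a small software model of that behavior for illustration; the struct and function names are hypothetical and no real register access is performed.

```c
#include <stdint.h>

/* Software model of the interrupt mask shared by IMS and IMC. */
struct irq_mask_model {
    uint32_t mask;
};

/* IMS write: bits written as 1b are enabled, 0b bits are unchanged. */
static void ims_write(struct irq_mask_model *m, uint32_t value)
{
    m->mask |= value;
}

/* IMC write: bits written as 1b are disabled, 0b bits are unchanged. */
static void imc_write(struct irq_mask_model *m, uint32_t value)
{
    m->mask &= ~value;
}

/* IMS read: returns the currently enabled causes. */
static uint32_t ims_read(const struct irq_mask_model *m)
{
    return m->mask;
}

/* An interrupt is presented only when a cause bit and its mask bit are both 1b. */
static int irq_pending(const struct irq_mask_model *m, uint32_t icr)
{
    return (icr & m->mask) != 0;
}
```

The model makes the practical consequence visible: writing 0 to IMS never disables anything, so disabling a cause always goes through IMC.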
10.2.4.7 interrupt auto clear - eiac (0x000dc; rw)

field       bit(s)  initial value  description
reserved    19:0    0x0            reserved.
eiac_value  24:20   0x0            auto clear bits for the corresponding bits of icr. this register is relevant to msi-x mode only, where read-to-clear cannot be used, as it might erase causes tied to other vectors. if any bits are set in eiac, the icr register should not be read. bits without auto clear set need to be cleared with write-to-clear.
reserved    31:25   0x0            reserved.

10.2.4.8 interrupt acknowledge auto-mask - iam (0x000e0; rw)

field      bit(s)  initial value  description
iam_value  31:0    0x0            when the ctrl_ext.iame bit is set and icr.int_assert = 1b, an icr read or write has the side effect of writing the contents of this register to the imc register.

10.2.4.9 interrupt vector allocation registers - ivar (0x000e4; rw)

this register is only valid in msi-x mode. it defines the allocation of the different interrupt causes to one of the msi-x vectors. each int_alloc[i] (i = 0...4) field indexes an entry in the msi-x table structure and msi-x pba structure.

field             bit(s)  initial value  description
int_alloc[0]      2:0     0x0            defines the msi-x vector assigned to the interrupt cause associated with this entry. valid values are 0 to 4 for msi-x mode. note: mapped to receive queue 0 (rxq0). rxq0 associates an interrupt occurring in rx queue 0 with a corresponding entry in the msi-x allocation registers.
int_alloc_val[0]  3       0              enable bit for rxq0.
int_alloc[1]      6:4     0x0            defines the msi-x vector assigned to the interrupt cause associated with this entry. valid values are 0 to 4 for msi-x mode. note: mapped to receive queue 1 (rxq1). rxq1 associates an interrupt occurring in rx queue 1 with a corresponding entry in the msi-x allocation registers.
int_alloc_val[1]  7       0              enable bit for rxq1.
int_alloc[2]      10:8    0x0            defines the msi-x vector assigned to the interrupt cause associated with this entry. valid values are 0 to 4 for msi-x mode. note: mapped to transmit queue 0 (txq0). txq0 associates an interrupt occurring in tx queue 0 with a corresponding entry in the msi-x allocation registers.
int_alloc_val[2]  11      0              enable bit for txq0.
int_alloc[3]      14:12   0x0            defines the msi-x vector assigned to the interrupt cause associated with this entry. valid values are 0 to 4 for msi-x mode. note: mapped to transmit queue 1 (txq1). txq1 associates an interrupt occurring in tx queue 1 with a corresponding entry in the msi-x allocation registers.
int_alloc_val[3]  15      0              enable bit for txq1.
int_alloc[4]         18:16  0x0  defines the msi-x vector assigned to the interrupt cause associated with this entry. valid values are 0 to 4 for msi-x mode. note: mapped to other cause. other cause associates an interrupt issued by other causes with a corresponding entry in the msi-x allocation registers.
int_alloc_val[4]     19     0    enable bit for other cause.
reserved             30:20  0x0  reserved.
interrupt_on_all_wb  31     0b   if set, tx interrupts occur on every write back, regardless of the rs bit.

note: if invalid values are written to the int_alloc fields the result is unexpected.

10.2.5 receive register descriptions

10.2.5.1 receive control register - rctl (0x00100; rw)

field     bit(s)  initial value  description
reserved  0       0b             reserved. this bit represented a hardware reset of the receive-related portion of the device in previous controllers, but is no longer applicable. only a full device reset (ctrl.rst) is supported. write as 0b for future compatibility.
en        1       0b             enable. the receiver is enabled when this bit is set to 1b. writing this bit to 0b stops reception after receipt of any in-progress packet. all subsequent packets are then immediately dropped until this bit is set to 1b.
sbp       2       0b             store bad packets. 0b = do not store. 1b = store. note that crc errors before the sfd are ignored. any packet must have a valid sfd (rx_dv with no rx_er in the gmii/mii i/f) in order to be recognized by the device (even bad packets). note: bad packets are not routed to manageability even if this bit is set.
upe       3       0b             unicast promiscuous enable. 0b = disabled. 1b = enabled.
mpe       4       0b             multicast promiscuous enable. 0b = disabled. 1b = enabled.
lpe       5       0b             long packet enable. 0b = disabled. 1b = enabled.
lbm       7:6     00b            loopback mode. should always be set to 00b. 00b = normal operation (or phy loopback in gmii/mii mode). 01b = mac loopback (test mode). 10b = undefined. 11b = undefined.
rdmts     9:8     00b            receive descriptor minimum threshold size. the corresponding interrupt is set whenever the fractional number of free descriptors becomes equal to rdmts. table 78 lists which fractional values correspond to rdmts values. see section 10.2.5.7 for details regarding rdlen.
dtyp      11:10   00b            descriptor type. 00b = legacy descriptor type. 01b = packet split descriptor type. 10b = reserved. 11b = reserved.
mo        13:12   00b            multicast offset. this determines which bits of the incoming multicast address are used in looking up the bit vector. 00b = [47:36]. 01b = [46:35]. 10b = [45:34]. 11b = [43:32].
reserved  14      0b             reserved.
bam       15      0b             broadcast accept mode. 0b = ignore broadcast packets (unless they pass through exact or imperfect filters). 1b = accept broadcast packets.
bsize     17:16   0b             receive buffer size. if rctl.bsex = 0b: 00b = 2048 bytes. 01b = 1024 bytes. 10b = 512 bytes. 11b = 256 bytes. if rctl.bsex = 1b: 00b = reserved; software should not set to this value. 01b = 16384 bytes. 10b = 8192 bytes. 11b = 4096 bytes. bsize is only used when dtyp = 00b. when dtyp = 01b, the buffer sizes for the descriptor are controlled by fields in the psrctl register. bsize is not relevant when flxbuf is different from 0x0; in that case, flxbuf determines the buffer size.
vfe       18      0b             vlan filter enable. 0b = disabled (filter table does not decide packet acceptance). 1b = enabled (filter table decides packet acceptance for 802.1q packets).
cfien     19      0b             canonical form indicator enable. 0b = disabled (cfi bit not compared to decide packet acceptance). 1b = enabled (cfi from packet must match next field to accept 802.1q packets).
cfi       20      0b             canonical form indicator bit value. if cfien is set, then 802.1q packets with cfi equal to this field are accepted; otherwise, the 802.1q packet is discarded.
reserved  21      0b             reserved. should be written with 0b to ensure future compatibility.
dpf       22      0b             discard pause frames. any valid pause frame is discarded regardless of whether it matches any of the filter registers. 0b = incoming frames subject to filter comparison. 1b = incoming pause frames ignored even if they match filter registers.
pmcf      23      0b             pass mac control frames. 0b = do not (specially) pass mac control frames. 1b = pass any mac control frame (type field value of 0x8808) that does not contain the pause opcode of 0x0001.
reserved  24      0b             reserved. should be written with 0b to ensure future compatibility.
bsex      25      0b             buffer size extension. modifies the buffer size indication (bsize). when set to 1b, the original bsize values are multiplied by 16.
secrc     26      0b             strip ethernet crc from incoming packet. do not dma to host memory.
flxbuf    30:27   0x0            determines a flexible buffer size. when this field is 0x0000, the buffer size is determined by bsize. if this field is different from 0x0000, the receive buffer size is the number represented in kb. for example, 0x0001 = 1 kb (1024 bytes).
reserved  31      0b             reserved. should be written with 0b to ensure future compatibility.

lpe controls whether long packet reception is permitted. hardware discards long packets if lpe is 0b. a long packet is one longer than 1522 bytes.

rdmts[1,0] determines the threshold value for free receive descriptors according to the following table:

table 78. rdmts values

rdmts  free buffer threshold
00b    1/2
01b    1/4
10b    1/8
11b    reserved

bsize controls the size of the receive buffers and permits software to trade off descriptor performance versus required storage space. buffers that are 2048 bytes require only one descriptor per receive packet, maximizing descriptor efficiency. buffers that are 256 bytes maximize memory efficiency at a cost of multiple descriptors for packets longer than 256 bytes.

three bits control the vlan filter table. the first determines whether the table participates in the packet acceptance criteria. the next two are used to decide whether the cfi bit found in the 802.1q packet should be used as part of the acceptance criteria.

dpf controls the dma function of flow control packets addressed to the station address (rah/l[0]). if a packet is a valid flow control packet and is addressed to the station address, it is not dma'd to host memory if dpf = 1b.
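The interaction of bsize, bsex and flxbuf described for rctl can be sketched in C as below. This is a minimal illustration, not driver code: the macro and function names are hypothetical, and only the bit positions and size encodings are taken from the rctl table.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions from the rctl table; the names are illustrative only. */
#define RCTL_BSIZE_SHIFT   16u
#define RCTL_BSEX          (1u << 25)
#define RCTL_FLXBUF_SHIFT  27u

/* Returns the receive buffer size in bytes for legacy descriptors
 * (dtyp = 00b), or 0 for the reserved bsex = 1b / bsize = 00b encoding. */
static uint32_t rctl_rx_buffer_bytes(uint32_t rctl)
{
    uint32_t flxbuf = (rctl >> RCTL_FLXBUF_SHIFT) & 0xFu;
    uint32_t bsize  = (rctl >> RCTL_BSIZE_SHIFT) & 0x3u;
    uint32_t base   = 2048u >> bsize;        /* 2048, 1024, 512, 256 */

    if (flxbuf != 0)
        return flxbuf * 1024u;               /* flxbuf overrides bsize, in kb */
    if (rctl & RCTL_BSEX)
        return bsize == 0 ? 0 : base * 16u;  /* 00b is reserved when bsex = 1b */
    return base;
}

/* Returns rctl with flxbuf replaced by the given size in kb (1..15). */
static uint32_t rctl_set_flxbuf_kb(uint32_t rctl, uint32_t kb)
{
    assert(kb >= 1 && kb <= 15);
    rctl &= ~(0xFu << RCTL_FLXBUF_SHIFT);
    return rctl | (kb << RCTL_FLXBUF_SHIFT);
}
```

For example, a register value with bsex = 1b and bsize = 01b decodes to 16384 bytes, and any non-zero flxbuf takes precedence over both fields.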
pmcf controls the dma function of mac control frames (other than flow control). a mac control frame in this context must be addressed to either the mac control frame multicast address or the station address, match the type field, and not match the pause opcode of 0x0001. if pmcf = 1b, frames meeting these criteria are dma'd to host memory.

the secrc bit controls whether the hardware strips the ethernet crc from the received packet. this stripping occurs prior to any checksum calculations. the stripped crc is not dma'd to host memory and is not included in the length reported in the descriptor.

10.2.5.2 packet split receive control register - psrctl (0x02170; rw)

note: if software sets a buffer size to zero, all buffers following that one must be set to zero as well. pointers in the receive descriptors to buffers with a zero size should be set to null pointers.

field   bit(s)  initial value  description
bsize0  6:0     0x2            receive buffer size for buffer 0. the value is in 128-byte resolution. value can be from 128 bytes to 16256 bytes (15.875 kb). default buffer size is 256 bytes. software should not program this field to a zero value.
rsv     7       0b             reserved. should be written with 0b to ensure future compatibility.
bsize1  13:8    0x4            receive buffer size for buffer 1. the value is in 1 kb resolution. value can be from 1 kb to 63 kb. default buffer size is 4 kb. software should not program this field to a zero value.
rsv     15:14   00b            reserved. should be written with 00b to ensure future compatibility.
bsize2  21:16   0x4            receive buffer size for buffer 2. the value is in 1 kb resolution. value can be from 1 kb to 63 kb. default buffer size is 4 kb. software can program this field to any value.
rsv     23:22   00b            reserved. should be written with 00b to ensure future compatibility.
bsize3  29:24   0x0            receive buffer size for buffer 3. the value is in 1 kb resolution. value can be from 1 kb to 63 kb. default buffer size is 0 kb. software can program this field to any value.
rsv     31:30   00b            reserved. should be written with 00b to ensure future compatibility.
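As a rough sketch of how the four psrctl buffer sizes pack into one register value, the helper below converts sizes to field values and enforces the constraints stated in the table and note. The function name and argument layout are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Builds a psrctl value. buffer 0 is given in bytes (128-byte units),
 * buffers 1..3 in kb (1 kb units), matching the field resolutions. */
static uint32_t psrctl_encode(uint32_t b0_bytes, uint32_t b1_kb,
                              uint32_t b2_kb, uint32_t b3_kb)
{
    /* bsize0 and bsize1 must not be programmed to zero. */
    assert(b0_bytes >= 128 && b0_bytes <= 16256 && b0_bytes % 128 == 0);
    assert(b1_kb >= 1 && b1_kb <= 63);
    assert(b2_kb <= 63 && b3_kb <= 63);
    /* once a buffer size is zero, all following buffers must be zero. */
    assert(b2_kb != 0 || b3_kb == 0);

    return (b0_bytes / 128u)   /* bsize0, bits 6:0   */
         | (b1_kb << 8)        /* bsize1, bits 13:8  */
         | (b2_kb << 16)       /* bsize2, bits 21:16 */
         | (b3_kb << 24);      /* bsize3, bits 29:24 */
}
```

Encoding the documented defaults (256 bytes, 4 kb, 4 kb, 0 kb) reproduces the initial field values 0x2, 0x4, 0x4 and 0x0.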
10.2.5.3 flow control receive threshold low - fcrtl (0x02160; rw)

this register contains the receive threshold used to determine when to send an xon packet. it counts in units of bytes. the lower 3 bits must be programmed to zero (8-byte granularity). software must set xone to enable the transmission of xon frames. whenever hardware crosses the receive high threshold (becoming more full), and then crosses the receive low threshold and xone is enabled (= 1b), hardware transmits an xon frame.

note: flow control reception/transmission are negotiated capabilities by the auto-negotiation process. when the device is manually configured, flow control operation is determined by the rfce and tfce bits of the device control register.

note: this register's address has been moved from where it was located in previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x00168.

field     bit(s)  initial value  description
reserved  2:0     0x0            reserved. the underlying bits might not be implemented in all versions of the chip. must be written with 0x0.
rtl       15:3    0x0            receive threshold low. fifo low water mark for flow control transmission.
reserved  30:16   0x0            reserved. reads as 0x0. should be written to 0x0 for future compatibility.
xone      31      0b             xon enable. 0b = disabled. 1b = enabled.

10.2.5.4 flow control receive threshold high - fcrth (0x02168; rw)

this register contains the receive threshold used to determine when to send an xoff packet. it counts in units of bytes. this value must be at least 8 bytes less than the maximum number of bytes allocated to the receive packet buffer (pba.rxa), and the lower 3 bits must be programmed to zero (8-byte granularity). whenever the receive fifo reaches the fullness indicated by rth, hardware transmits a pause frame if the transmission of flow control frames is enabled.

note: flow control reception/transmission are negotiated capabilities by the auto-negotiation process. when the device is manually configured, flow control operation is determined by the rfce and tfce bits of the device control register.

field     bit(s)  initial value  description
reserved  2:0     0x0            reserved. the underlying bits might not be implemented in all versions of the chip. must be written with 0x0.
rth       15:3    0x0            receive threshold high. fifo high water mark for flow control transmission.
reserved  31:16   0x0            reserved. reads as 0b. should be written to 0b for future compatibility.
note: this register's address has been moved from where it was located in previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x00160.

10.2.5.5 receive descriptor base address low - rdbal (0x02800 + n*0x100[n=0..1]; rw)

this register contains the lower bits of the 64-bit descriptor base address. the lower 4 bits are always ignored. the receive descriptor base address must point to a 16-byte aligned block of data.

note: this register's address has been moved from where it was located in previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x00110.

field  bit(s)  initial value  description
0      3:0     0x0            ignored on writes. returns 0b on reads.
rdbal  31:4    x              receive descriptor base address low.

10.2.5.6 receive descriptor base address high - rdbah (0x02804 + n*0x100[n=0..1]; rw)

this register contains the upper 32 bits of the 64-bit descriptor base address.

note: this register's address has been moved from where it was located in previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x00114.

field  bit(s)  initial value  description
rdbah  31:0    x              receive descriptor base address [63:32].

10.2.5.7 receive descriptor length - rdlen (0x02808 + n*0x100[n=0..1]; rw)

this register sets the number of bytes allocated for descriptors in the circular descriptor buffer. it must be 128-byte aligned.

note: this register's address has been moved from where it was located in previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x00118.

field     bit(s)  initial value  description
0         6:0     0x0            ignored on write. reads back as 0x0.
len       19:7    0x0            descriptor length.
reserved  31:20   0x0            reads as 0x0. should be written to 0x0 for future compatibility.
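The alignment rules for rdbal, rdbah and rdlen can be illustrated with a small helper that derives the three register values from a ring's physical base address and descriptor count. This is a sketch under stated assumptions: the struct and function names are hypothetical, and the 16-byte descriptor size is taken from the descriptor descriptions elsewhere in this section.

```c
#include <assert.h>
#include <stdint.h>

/* Values to be written to the per-queue ring registers. */
typedef struct {
    uint32_t rdbal;  /* base address bits 31:4 (bits 3:0 ignored) */
    uint32_t rdbah;  /* base address bits 63:32 */
    uint32_t rdlen;  /* ring length in bytes, 128-byte aligned */
} rx_ring_regs;

static rx_ring_regs rx_ring_encode(uint64_t base, uint32_t num_desc)
{
    rx_ring_regs r;

    /* the base must point to a 16-byte aligned block. */
    assert((base & 0xFu) == 0);
    /* 16-byte descriptors and 128-byte alignment imply a multiple of 8;
     * the len field (bits 19:7) caps the ring at 0xFFF80 bytes. */
    assert(num_desc != 0 && num_desc % 8u == 0 && num_desc <= 0xFFF80u / 16u);

    r.rdbal = (uint32_t)(base & 0xFFFFFFF0u);
    r.rdbah = (uint32_t)(base >> 32);
    r.rdlen = num_desc * 16u;
    return r;
}
```

A ring of 256 descriptors therefore yields rdlen = 4096, with the low seven bits of the value naturally zero.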
10.2.5.8 receive descriptor head - rdh (0x02810 + n*0x100[n=0..1]; rw)

this register contains the head pointer for the receive descriptor buffer. the register points to a 16-byte datum. hardware controls the pointer. the only time that software should write to this register is after a reset (hardware reset or ctrl.rst) and before enabling the receive function (rctl.en). if software were to write to this register while the receive function was enabled, the on-chip descriptor buffers might be invalidated and the hardware could become unstable.

note: this register's address has been moved from where it was located in previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x00120.

field     bit(s)  initial value  description
rdh       15:0    0x0            receive descriptor head.
reserved  31:16   0x0            reserved. should be written with 0x0.

10.2.5.9 receive descriptor tail - rdt (0x02818 + n*0x100[n=0..1]; rw)

this register contains the tail pointer for the receive descriptor buffer. the register points to a 16-byte datum. software writes the tail register to add receive descriptors to the hardware free list for the ring.

note: this register's address has been moved from where it was located in previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x00128.

field     bit(s)  initial value  description
rdt       15:0    0x0            receive descriptor tail.
reserved  31:16   0x0            reads as 0x0. should be written to 0x0 for future compatibility.

10.2.5.10 rx interrupt delay timer [packet timer] - rdtr (0x02820; rw)

this register is used to delay interrupt notification for the receive descriptor ring by coalescing interrupts for multiple received packets. delaying interrupt notification helps maximize the number of receive packets serviced by a single interrupt.

field     bit(s)  initial value  description
delay     15:0    0x0            receive packet delay timer measured in increments of 1.024 µs.
reserved  30:16   0x0            reserved. reads as 0x0.
fpd       31      0x0            flush partial descriptor block. when set to 1b, flushes the partial descriptor block; ignored otherwise. reads 0b.
this feature operates by initiating a countdown timer upon successfully receiving each packet to system memory. if a subsequent packet is received before the timer expires, the timer is re-initialized to the programmed value and restarts its countdown. if the timer expires due to not having received a subsequent packet within the programmed interval, pending receive descriptor write backs are flushed and a receive timer interrupt is generated.

setting the value to zero represents no delay from a receive packet to the interrupt notification, and results in immediate interrupt notification for each received packet.

writing this register with fpd set initiates an immediate expiration of the timer, causing a write back of any consumed receive descriptors pending write back, and results in a receive timer interrupt in the icr.

a receive interrupt due to a receive absolute timer (radv) expiration cancels a pending rdtr interrupt. the rdtr countdown timer is reloaded but halted, so as to avoid generation of a spurious second interrupt after the radv has been noted, but can be restarted by a subsequent received packet.

note: fpd is self clearing.

note: this register's address has been moved from where it was located in previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x00108.

10.2.5.11 receive descriptor control - rxdctl (0x02828 + n*0x100[n=0..1]; rw)

note: any value written to rxdctl0 is automatically written to rxdctl1. writes to rxdctl1 affect rxdctl1 only.

this register controls the fetching and write back of receive descriptors. the three threshold values are used to determine when descriptors are read from and written to host memory. the values can be in units of cache lines or descriptors (each descriptor is 16 bytes) based on the gran flag. if gran = 0b (specifications are in cache-line granularity), the thresholds specified (based on the cache line size specified in the pcie header cls field) must not represent greater than 31 descriptors.

when (wthresh = 0b) or (wthresh = 1b and gran = 1b) only descriptors with the rs bit set are written back.

field     bit(s)  initial value  description
pthresh   5:0     0x00           prefetch threshold.
rsv       7:6     0x00           reserved.
hthresh   13:8    0x00           host threshold.
reserved  14      0b             reserved.
rsv       15      0b             reserved.
wthresh   21:16   0x01           write-back threshold.
rsv       23:22   00b            reserved.
gran      24      0b             granularity. units for the thresholds in this register. 0b = cache lines. 1b = descriptors.
rsv       31:25   0x0            reserved.
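A minimal sketch of packing the rxdctl fields is shown below. The function name is hypothetical and the check only enforces the 6-bit field widths from the table; any tighter limits that depend on gran and the cache line size are left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Packs the three thresholds and the granularity flag into an rxdctl
 * value. desc_gran != 0 selects descriptor units (gran = 1b). */
static uint32_t rxdctl_encode(uint32_t pthresh, uint32_t hthresh,
                              uint32_t wthresh, int desc_gran)
{
    /* each threshold field is 6 bits wide. */
    assert(pthresh <= 0x3Fu && hthresh <= 0x3Fu && wthresh <= 0x3Fu);

    return pthresh                         /* bits 5:0   */
         | (hthresh << 8)                  /* bits 13:8  */
         | (wthresh << 16)                 /* bits 21:16 */
         | (desc_gran ? (1u << 24) : 0u);  /* bit 24     */
}
```

With all thresholds at their initial values (wthresh = 1, everything else zero) the result is 0x00010000, matching the reset state in the table. The threshold values a driver should actually use are a tuning decision not covered by this sketch.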
pthresh is used to control when a prefetch of descriptors is considered. this threshold refers to the number of valid, unprocessed receive descriptors the chip has in its on-chip buffer. if this number drops below pthresh, the algorithm considers prefetching descriptors from host memory. this fetch does not happen, however, unless there are at least hthresh valid descriptors in host memory to fetch.

note: hthresh should be given a non-zero value whenever pthresh is used.

wthresh controls the write back of processed receive descriptors. this threshold refers to the number of receive descriptors in the on-chip buffer which are ready to be written back to host memory. in the absence of external events (explicit flushes), the write back occurs only after at least wthresh descriptors are available for write back.

note: possible values:
gran = 1b (descriptor granularity): pthresh = 0..47, wthresh = 0..63, hthresh = 0..63
gran = 0b (cache line granularity): pthresh = 0..3 (for 16 descriptors cacheline - 256 bytes), wthresh = 0..3, hthresh = 0..4

note: for any wthresh value other than zero, the packet and absolute timers must get a non-zero value for the wthresh feature to take effect.

note: since the default value for write-back threshold is one, the descriptors are normally written back as soon as one cache line is available. wthresh must contain a non-zero value to take advantage of the write-back bursting capabilities of the 82574.

10.2.5.12 receive interrupt absolute delay timer - radv (0x0282c; rw)

if the packet delay timer is used to coalesce receive interrupts, it ensures that when receive traffic abates, an interrupt is generated within a specified interval of no receives. during times when receive traffic is continuous, it might be necessary to ensure that no receive remains unnoticed for too long an interval. this register can be used to ensure that a receive interrupt occurs at some predefined interval after the first packet is received.

when this timer is enabled, a separate absolute countdown timer is initiated upon successfully receiving each packet to system memory. when this absolute timer expires, pending receive descriptor write backs are flushed and a receive timer interrupt is generated.

field     bit(s)  initial value  description
delay     15:0    0x0            receive absolute delay timer measured in increments of 1.024 µs (0 = disabled).
reserved  31:16   0x0            reserved. reads as 0x0.
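Both the rdtr and radv delay fields count in 1.024 µs (1024 ns) increments, so a driver has to convert a wall-clock interval into ticks. The sketch below shows one way to do that; the function names are hypothetical, and rounding up and saturating at the 16-bit field limit are design choices of this example, not requirements stated in the register descriptions.

```c
#include <assert.h>
#include <stdint.h>

/* Converts nanoseconds to 1.024 us ticks for a 16-bit delay field.
 * Rounds up so a small non-zero request does not collapse to 0, which
 * would mean "no delay" for rdtr and "disabled" for radv. */
static uint16_t rx_delay_ticks_from_ns(uint64_t ns)
{
    uint64_t ticks = ns / 1024u + (ns % 1024u != 0);

    if (ticks > 0xFFFFu)
        ticks = 0xFFFFu;  /* saturate at the field maximum */
    return (uint16_t)ticks;
}

/* Builds an rdtr value; flush != 0 sets the self-clearing fpd bit. */
static uint32_t rdtr_encode(uint16_t ticks, int flush)
{
    return (uint32_t)ticks | (flush ? (1u << 31) : 0u);
}
```

A request of 100 µs (100000 ns) becomes 98 ticks, which corresponds to roughly 100.4 µs of actual delay.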
setting this register to 0x0 disables the absolute timer mechanism (the rdtr register should be used with a value of 0x0 to cause immediate interrupts for all receive packets).

a receive interrupt due to a receive packet timer (rdtr) expiration cancels a pending radv interrupt. if enabled, the radv countdown timer is reloaded but halted, so as to avoid generation of a spurious second interrupt after the rdtr has been noted.

10.2.5.13 receive small packet detect interrupt - rsrpd (0x02c00; rw)

field     bit(s)  initial value  description
size      11:0    0x0            if the interrupt is enabled, any received packet of size <= size asserts an interrupt. size is specified in bytes and includes the headers and the crc. it does not include the vlan header in the size calculation if it is stripped.
reserved  31:12   x              reserved.

10.2.5.14 receive ack interrupt delay register - raid (0x02c08; rw)

if an immediate (non-scheduled) interrupt is desired for any received ack frame, ack_delay should be set to 0x00.

field      bit(s)  initial value  description
ack_delay  15:0    0x0            ack delay timer measured in increments of 1.024 µs. when the receive ack frame detect interrupt is enabled in the ims register, received ack packets use a unique delay timer to generate an interrupt. when an ack is received, an absolute timer loads to the value of ack_delay. the interrupt signal is set only when the timer expires. if another ack packet is received while the timer is counting down, the timer is not reloaded to ack_delay.
rsv        31:16   0x0            reserved.

10.2.5.15 receive checksum control - rxcsum (0x05000; rw)

the receive checksum control register controls the receive checksum offloading features of the 82574. the 82574 supports the offloading of three receive checksum calculations: the packet checksum, the ip header checksum, and the tcp/udp checksum.

field     bit(s)  initial value  description
pcss      7:0     0x0            packet checksum start.
ipofld    8       1b             ip checksum offload enable.
tuofld    9       1b             tcp/udp checksum offload enable.
reserved  10      0b             reserved.
crcofl    11      0b             crc32 offload enable.
ippcse    12      0b             ip payload checksum enable.
pcsd      13      0b             packet checksum disable.
reserved  31:14   0x0            reserved.
pcsd: the packet checksum and ip identification fields are mutually exclusive with the rss hash. only one of the two options is reported in the rx descriptor. the rxcsum.pcsd effect is listed as follows:

rxcsum.pcsd                                                0b (checksum enable)                                                      1b (checksum disable)
legacy rx descriptor (rctl.dtyp = 00b)                     packet checksum is reported in the rx descriptor.                         unsupported configuration.
extended or header split rx descriptor (rctl.dtyp = 01b)   packet checksum and ip identification are reported in the rx descriptor.  rss hash value is reported in the rx descriptor.

pcss, ippcse: the pcss and the ippcse control the packet checksum calculation. as previously stated, the packet checksum shares the same location as the rss field. the packet checksum is reported in the receive descriptor when the rxcsum.pcsd bit is cleared.

if rxcsum.ippcse is cleared (the default value), the checksum calculation that is reported in the rx packet checksum field is the unadjusted 16-bit ones complement of the packet. the packet checksum starts from the byte indicated by rxcsum.pcss (zero corresponds to the first byte of the packet), after vlan stripping if enabled by ctrl.vme. for example, for an ethernet ii frame encapsulated as an 802.3ac vlan packet and with rxcsum.pcss set to 14, the packet checksum would include the entire encapsulated frame, excluding the 14-byte ethernet header (da, sa, type/length) and the 4-byte vlan tag. the packet checksum does not include the ethernet crc if the rctl.secrc bit is set. software must make the required offsetting computation (to back out the bytes that should not have been included and to include the pseudo-header) prior to comparing the packet checksum against the tcp checksum stored in the packet.

if rxcsum.ippcse is set, the packet checksum is aimed to accelerate checksum calculation of fragmented udp packets.

note: the pcss value should not exceed a pointer to the ip header start, or else it will erroneously calculate the ip header checksum or tcp/udp checksum.

rxcsum.ipofld is used to enable the ip checksum offloading feature. if rxcsum.ipofld is set to one, the 82574 calculates the ip checksum and indicates a pass/fail indication to software via the ip checksum error bit (ipe) in the error field of the receive descriptor. similarly, if rxcsum.tuofld is set to one, the 82574 calculates the tcp or udp checksum and indicates a pass/fail indication to software via the tcp/udp checksum error bit (tcpe). similarly, if rfctl.ipv6_dis and rfctl.ip6xsum_dis are cleared to zero and rxcsum.tuofld is set to one, the 82574 calculates the tcp or udp checksum for ipv6 packets. it then indicates a pass/fail condition in the tcp/udp checksum error bit (rdesc.tcpe).

this applies to checksum offloading only. supported frame types:
• ethernet ii
• ethernet snap

rxcsum.crcofl is used to enable the crc32 checksum offloading feature. if rxcsum.crcofl is set to one, the 82574 calculates the crc32 checksum and indicates a pass/fail indication to software via the crc32 checksum error bit (crce) in the error field of the receive descriptor.

this register should only be initialized (written) when the receiver is not enabled (for example, only write this register when rctl.en = 0b).
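The "offsetting computation" mentioned for the packet checksum relies on ones complement arithmetic: the contribution of bytes that should not have been summed can be backed out by adding the complement of their partial sum. The sketch below shows only that arithmetic. All names are hypothetical, and it assumes big-endian 16-bit words, an uncomplemented sum, and an unwanted region that starts and ends on an even byte offset; how the hardware value maps onto these assumptions must be checked against the receive descriptor description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Folds carries back into the low 16 bits (ones complement addition). */
static uint16_t csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}

/* 16-bit ones complement sum over len bytes, big-endian words.
 * Suitable for packet-sized buffers (the 32-bit accumulator would
 * only overflow beyond 128 kb of input). */
static uint16_t csum_partial(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (len & 1u)
        sum += (uint32_t)p[len - 1] << 8;  /* pad odd trailing byte */
    return csum_fold(sum);
}

/* Removes the contribution of a region whose partial sum is 'part'. */
static uint16_t csum_remove(uint16_t total, uint16_t part)
{
    return csum_fold((uint32_t)total + (uint16_t)~part);
}
```

Including the pseudo-header works the same way in the other direction: its partial sum is added to the running total and folded.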
10.2.5.16 receive filter control register - rfctl (0x05008; rw)

field        bit(s)  initial value  description
iscsi_dis    0       0b             iscsi disable. disable the iscsi filtering.
iscsi_dwc    5:1     0x0            iscsi dword count. this field indicates the dword count of the iscsi header, which is used for the packet split mechanism.
nfsw_dis     6       0b             nfs write disable. disable filtering of nfs write request headers.
nfsr_dis     7       0b             nfs read disable. disable filtering of nfs read reply headers.
nfs_ver      9:8     00b            nfs version. 00b = nfs version 2. 01b = nfs version 3. 10b = nfs version 4. 11b = reserved for future use.
ipv6_dis     10      0b             ipv6 disable. disable ipv6 packet filtering.
ip6xsum_dis  11      0b             ipv6 xsum disable. disable xsum on ipv6 packets.
ackdis       12      0b             ack accelerate disable. when this bit is set, the 82574 does not accelerate interrupt on tcp ack packets.
ackd_dis     13      0b             ack data disable. 1b = the 82574 recognizes ack packets according to the ack bit in the tcp header + no tcp data. 0b = the 82574 recognizes ack packets according to the ack bit only. this bit is relevant only if the ackdis bit is not set.
ipfrsp_dis   14      0b             ip fragment split disable. when this bit is set, the header of ip fragmented packets are not set.
exsten       15      0b             extended status enable. when the exsten bit is set or when the packet split receive descriptor is used, the 82574 writes the extended status to the rx descriptor.
reserved     16      0b             reserved.
reserved     17      0b             reserved.
reserved     31:18   0x0            reserved. should be written with 0x0 to ensure future compatibility.

10.2.5.17 management vlan tag value 0 - mavtv0 (0x5010; rw)

field      bit(s)  initial value  description
vlan id 0  11:0    0x0            contains the vlan id that should be compared with the incoming packet if bit 31 is set.
rsv        30:12   0x0            reserved.
en         31      0x0            enable vid filtering.

10.2.5.18 management vlan tag value 1 - mavtv1 (0x5014; rw)

field      bit(s)  initial value  description
vlan id 1  11:0    0x0            contains the vlan id that should be compared with the incoming packet if bit 31 is set.
rsv        30:12   0x0            reserved.
en         31      0x0            enable vid filtering.

10.2.5.19 management vlan tag value 2 - mavtv2 (0x5018; rw)

field    bit(s)  initial value  description
vlan id  11:0    0x0            contains the vlan id that should be compared with the incoming packet if bit 31 is set.
rsv      30:12   0x0            reserved.
en       31      0x0            enable vid filtering.

10.2.5.20 management vlan tag value 3 - mavtv3 (0x501c; rw)

field    bit(s)  initial value  description
vlan id  11:0    0x0            contains the vlan id that should be compared with the incoming packet if bit 31 is set.
rsv      30:12   0x0            reserved.
en       31      0x0            enable vid filtering.

10.2.5.21 multicast table array - mta[127:0] (0x05200-0x053fc; rw)

there is one register per 32 bits of the multicast address table for a total of 128 registers (thus the mta[127:0] designation). the size of the word array depends on the number of bits implemented in the multicast address table. software must mask to the desired bit on reads and supply a 32-bit word on writes.

note: all accesses to this table must be 32-bit.

note: these registers' addresses have been moved from where they were located in previous devices. however, for backwards compatibility, these registers can also be accessed at their alias offsets of 0x00200-0x003fc.

field       bit(s)  initial value  description
bit vector  31:0    x              word-wide bit vector specifying 32 bits in the multicast address filter table.
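A sketch of how software might locate the mta bit for a multicast address follows. It combines the rctl.mo bit ranges ([47:36], [46:35], [45:34], [43:32]) with two assumptions that are not spelled out in the table itself: that bits 47:40 of the stored address are the last byte received on the wire and bits 39:32 the byte before it, and that the upper seven bits of the resulting 12-bit value select one of the 128 registers while the lower five select the bit. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the 12-bit lookup value for destination address da (bytes in
 * wire order) under multicast offset mo (the rctl.mo field, 0..3). */
static uint16_t mta_hash(const uint8_t da[6], unsigned mo)
{
    /* lowest address bit used for each mo setting, relative to bit 32 */
    static const unsigned shift[4] = {4, 3, 2, 0};
    /* only address bits 47:32 take part: da[4] = 39:32, da[5] = 47:40 */
    uint32_t hi16 = (uint32_t)da[4] | ((uint32_t)da[5] << 8);

    assert(mo < 4);
    return (uint16_t)((hi16 >> shift[mo]) & 0xFFFu);
}

/* Sets the filter bit for a lookup value in a shadow copy of the table.
 * Each entry would then be written to hardware as a full 32-bit word. */
static void mta_set(uint32_t mta[128], uint16_t hash)
{
    mta[(hash >> 5) & 0x7Fu] |= 1u << (hash & 0x1Fu);
}
```

Keeping a shadow copy in memory and writing whole words matches the requirement that all accesses to the table are 32-bit.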
figure 61 shows the multicast lookup algorithm. the destination address shown represents the internally stored ordering of the received da. note that bit 0 indicated in this diagram is the first on the wire.

[figure 61. multicast table array algorithm: the destination address (bytes 47:40, 39:32, 31:24, 23:16, 15:8, 7:0) yields a pointer selected by bank[1:0]; pointer[11:5] selects the word and pointer[4:0] selects the bit within the 32 x 128 multicast table array (4096-bit vector).]

10.2.5.22 receive address low - ral (0x05400 + 8*n; rw)

where n is the index of the exact unicast/multicast address entry, n = 0, 1, ... 15.

these registers contain the lower bits of the 48-bit ethernet address. all 32 bits are valid. if the nvm is present, the first register (ral0) is loaded from the nvm.

note: these registers' addresses have been moved from where they were located in previous devices. however, for backwards compatibility, these registers can also be accessed at their alias offsets of 0x0040-0x000bc.

field  bit(s)  initial value  description
ral    31:0    x              receive address low. the lower 32 bits of the 48-bit ethernet address.
329 driver programing interface?82574 gbe controller 10.2.5.23 receive address high - rah (0x05404 + 8*n; rw) while "n" is the exact unicast/multicast address entry and it is equals to 0,1,?15 these registers contain the upper bits of the 48-bit ethernet address. the complete address is {rah, ral}. av determines whether this address is compared against the incoming packet. av is cleared by a master reset in entries 0-14, and on internal power on reset in entry 15. asel enables the device to perform sp ecial filtering on receive packets. note: the first receive address register (rar0) is also used for exact match pause frame checking (da matches the first register). therefore rar0 should always be used to store the individual ethernet mac address of the 82574. note: these registers' addresses have been moved from where they were located in previous devices. however, for backwards compatibility, these registers can also be accessed at their alias offsets of 0x0040-0x000bc. after reset, if the nvm is present, the first register (receive address register 0) is loaded from the ia field in the nvm, its address select field will be 00b, and its address valid field will be 1b. if no nvm is present the address valid field for n=0b will be 0b. the address valid field for all of the other registers is 0b. note: the software device driver can use only entries 0-14. entry 15 is reserved for manageability firmware usage. 10.2.5.24 vlan filter table array - vfta[127:0] (0x05600-0x057fc; rw) field bit(s) initial value description rah 15:0 x receive address high the upper 16 bits of the 48-bit ethernet address. asel 17:16 x address select selects how the address is to be used. decoded as follows: 00b = destination address (must be set to this in normal mode). 01b = source address. 10b = reserved. 11b = reserved. reserved 30:18 0x0 reserved reads as 0x0. ignored on write. av 31 x address valid cleared after master reset. 
if the nvm is present, the address valid field of receive address register 0 is set to 1b after a software or pci reset or nvm read. in entries 0-14 this bit is cleared by master reset. the av bit of entry 15 is cleared by internal power on reset.

vfta fields:
field | bit(s) | initial value | description
bit vector | 31:0 | x | double word-wide bit vector specifying 32 bits in the vlan filter table.
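the vfta spans 0x05600-0x057fc, that is 128 registers of 32 bits, or a 4096-bit vector. a minimal sketch of locating the bit for one vlan follows, assuming the 12-bit vlan id is used directly as the pointer, with bits 11:5 selecting the register and bits 4:0 selecting the bit, as in the multicast table array. the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VFTA_BASE 0x05600u  /* first vfta register, from the register heading */

/* offset of the 32-bit vfta register holding the bit for this vlan id */
static uint32_t vfta_reg_offset(uint16_t vid)
{
    assert(vid < 4096u);
    return VFTA_BASE + 4u * (uint32_t)(vid >> 5);
}

/* mask of the bit for this vlan id inside that register */
static uint32_t vfta_bit_mask(uint16_t vid)
{
    assert(vid < 4096u);
    return 1u << (vid & 0x1fu);
}
```

because all accesses to the table must be 32-bit, enabling a vlan is a read-modify-write of the whole register: read the word at vfta_reg_offset(vid), or in vfta_bit_mask(vid), and write the word back.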
there is one register per 32 bits of the vlan filter table. the size of the word array depends on the number of bits implemented in the vlan filter table. software must mask to the desired bit on reads and supply a 32-bit word on writes.

note: all accesses to this table must be 32-bit.

the algorithm for vlan filtering via the vfta is identical to that used for the multicast table array.

note: these registers' addresses have been moved from where they were located in previous devices. however, for backwards compatibility, these registers can also be accessed at their alias offsets of 0x00600-0x006fc.

10.2.5.25 multiple receive queues command register - mrqc (0x05818; rw)

10.2.5.26 redirection table - reta (0x05c00-0x05c7f; rw)

the redirection table is a 128-entry table, each entry is 8-bits wide. only 6 bits of each entry are used (5 bits for the cpu index and 1 bit for queue index). the table is configured through the following read/write registers.

mrqc fields:
field | bit(s) | initial value | description
multiple receive queues enable | 1:0 | 00b | multiple receive queues enable. enables support for multiple receive queues and defines the mechanism that controls queue allocation. note that the rxcsum.pcsd bit must also be set to enable multiple receive queues. 00b = multiple receive queues are disabled. 01b = multiple receive queues as defined by msft rss. the rss field enable bits define the header fields used by the hash function. 10b = reserved. 11b = reserved. note that this field can be modified only when receive to host is not enabled (rctl.en = 0b).
reserved | 15:2 | 0x0 | reserved
rss field enable | 31:16 | 0x0 | each bit, when set, enables a specific field selection to be used by the hash function. several bits can be set at the same time.
bit[16] - enable tcpipv4 hash function
bit[17] - enable ipv4 hash function
bit[18] - enable tcpipv6 hash function
bit[19] - enable ipv6ex hash function
bit[20] -
enable ipv6 hash function
bits[31:21] - reserved

[reta register layout: each 32-bit register holds four tags; bits 31:24 = tag 3, 23:16 = tag 2, 15:8 = tag 1, 7:0 = tag 0; the last register ends with tag 127]
each entry (byte) of the redirection table contains the following information.
- bit [7] - queue index
- bits [6:0] - reserved

note: reta cannot be read when rss is enabled.

field | dw/bit(s) | initial value | description
entry 0 | 0 / 7:0 | undefined (1) | determines the physical queue for index 0.
...
entry 127 | 31 / 31:24 | undefined | determines the physical queue for index 127.
(1) system software must initialize the table prior to enabling multiple receive queues.

10.2.5.27 rss random key register - rssrk (0x05c80-0x05ca7; rw)

the rss random key register stores a 40-byte key used by the rss hash function (see section 7.1.11.1).

[rssrk register layout: each 32-bit register holds four key bytes; bits 31:24 = k[3], 23:16 = k[2], 15:8 = k[1], 7:0 = k[0]; the last register holds k[39] down to k[36]]

field | dword/bit(s) | initial value | description
byte 0 | 0 / 7:0 | 0x0?0 | byte 0 of the rss random key.
...
byte 39 | 9 / 31:24 | 0x0?0 | byte 39 of the rss random key.
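the byte placement in the rssrk table (byte 0 in dword 0 bits 7:0, byte 39 in dword 9 bits 31:24) amounts to a little-endian packing of the 40-byte key into ten 32-bit registers. a small sketch, with hypothetical names, that prepares the ten register values:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RSSRK_BASE   0x05c80u  /* first rssrk register, from the heading */
#define RSSRK_DWORDS 10u       /* 40 key bytes / 4 bytes per register */

/* pack key[0..39] so that key[4*i] lands in bits 7:0 of out[i] */
static void rssrk_pack(const uint8_t key[40], uint32_t out[10])
{
    assert(key != NULL && out != NULL);
    for (size_t i = 0; i < RSSRK_DWORDS; i++) {
        out[i] = (uint32_t)key[4 * i] |
                 ((uint32_t)key[4 * i + 1] << 8) |
                 ((uint32_t)key[4 * i + 2] << 16) |
                 ((uint32_t)key[4 * i + 3] << 24);
    }
}
```

out[i] would then be written to the register at RSSRK_BASE + 4*i.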
10.2.6 transmit register descriptions

10.2.6.1 transmit control register - tctl (0x00400; rw)

field | bit(s) | initial value | description
reserved | 0 | 0b | reserved. write as 0b for future compatibility.
en | 1 | 0b | enable. the transmitter is enabled when this bit is set to 1b. writing this bit to 0b stops transmission after any in-progress packets are sent. data remains in the transmit fifo until the device is re-enabled. software should combine this with a reset if the packets in the fifo need to be flushed.
reserved | 2 | 0b | reserved. reads as 0b. should be written to 0b for future compatibility.
psp | 3 | 1b | pad short packets (with valid data, not padding symbols). 0b = do not pad. 1b = pad. padding makes the packet 64 bytes. this is not the same as the minimum collision distance. if padding of short packets is allowed, the value in the tx descriptor length field should be not less than 17 bytes.
ct | 11:4 | 0x0 | collision threshold. this determines the number of attempts at re-transmission prior to giving up on the packet (not including the first transmission attempt). while this can be varied, it should be set to a value of 15 in order to comply with the ieee specification requiring a total of 16 attempts. the ethernet back-off algorithm is implemented and clamps to the maximum number of slot times after 10 retries. this field only has meaning while in half-duplex operation.
cold | 21:12 | 0b | collision distance. specifies the minimum number of byte times that must elapse for proper csma/cd operation. packets are padded with special symbols, not valid data bytes. hardware checks and pads to this value plus one byte even in full-duplex operation.
swxoff | 22 | 0b | software xoff transmission. when set to 1b, the device schedules the transmission of an xoff (pause) frame using the current value of the pause timer. this bit self clears upon transmission of the xoff frame.
pbe | 23 | 0b | packet burst enable. the 82574 does not support packet bursting for 1 gb/s half-duplex transmit operation. this bit must be set to 0b.
rtlc | 24 | 0b | re-transmit on late collision. enables the device to re-transmit on a late collision event. this bit is ignored in full-duplex mode.
unortx | 25 | | under run no re-transmit
txdscmt | 27:26 | | tx descriptor minimum threshold
mulr | 28 | 1b | multiple request support. this bit defines the number of read requests the 82574 issues for transmit data. when set to 0b, the 82574 submits only one request at a time; when set to 1b, the 82574 might submit up to four concurrent requests. the software device driver must not modify this register when the tx head register is not equal to the tail register. this bit is loaded from the nvm word 0x24/0x14.
two fields deserve special mention: ct and cold. software might choose to abort packet transmission in less than the ethernet mandated 16 collisions. for this reason, hardware provides ct.

wire speeds of 1000 mb/s result in a very short collision radius with traditional minimum packet sizes. cold specifies the minimum number of bytes in the packet to satisfy the desired collision distance. it is important to note that the resulting packet has special characters appended to the end. these are not regular data characters. hardware strips special characters for packets that go from 1000 mb/s environments to 100 mb/s environments. note that the hardware evaluates this field against the packet size in full duplex as well.

note: while 802.3x flow control is only defined during full duplex operation, the sending of pause frames via the swxoff bit is not gated by the duplex settings within the device. software should not write a 1b to this bit while the device is configured for half-duplex operation.

rtlc configures the 82574 to perform retransmission of packets when a late collision is detected. note that the collision window is speed dependent: 64 bytes for 10/100 mb/s and 512 bytes for 1000 mb/s operation. if a late collision is detected when this bit is disabled, the transmit function assumes the packet is successfully transmitted. this bit is ignored in full-duplex mode.

10.2.6.2 transmit ipg register - tipg (0x00410; rw)

tctl fields (continued):
rrthresh | 30:29 | 01b | read request threshold. these bits define the threshold size for the intermediate buffer to determine when to send the read command to the packet buffer. threshold is defined as follows:
rrthresh = 00b: threshold = 2 lines of 16 bytes
rrthresh = 01b: threshold = 4 lines of 16 bytes
rrthresh = 10b: threshold = 8 lines of 16 bytes
rrthresh = 11b: threshold = no threshold (transfer data after all of the request is in the rfifo)
reserved | 31 | 0b | reserved. reads as 0b.
should be written to 0b for future compatibility.

tipg fields:
field | bit(s) | initial value | description
ipgt | 9:0 | 0x8 | ipg transmit time. measured in increments of the mac clock: 8 ns @ 1 gb/s, 80 ns @ 100 mb/s, 800 ns @ 10 mb/s.
ipgr1 | 19:10 | 0x8 | ipg receive time 1. measured in increments of the mac clock: 8 ns @ 1 gb/s, 80 ns @ 100 mb/s, 800 ns @ 10 mb/s.
ipgr2 | 29:20 | 0x6 | ipg receive time 2. measured in increments of the mac clock: 8 ns @ 1 gb/s, 80 ns @ 100 mb/s, 800 ns @ 10 mb/s.
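the three tipg fields occupy fixed 10-bit positions, so a register value can be assembled with shifts. the sketch below uses a hypothetical helper name; with ipgt = 6, ipgr1 = 8 and ipgr2 = 6 it yields 0x00602006, the value this section suggests for the 82574.

```c
#include <assert.h>
#include <stdint.h>

/* build a tipg value from its fields:
 * ipgt in bits 9:0, ipgr1 in bits 19:10, ipgr2 in bits 29:20 */
static uint32_t tipg_pack(uint32_t ipgt, uint32_t ipgr1, uint32_t ipgr2)
{
    /* each field is 10 bits wide */
    assert(ipgt < 1024u && ipgr1 < 1024u && ipgr2 < 1024u);
    return ipgt | (ipgr1 << 10) | (ipgr2 << 20);
}
```

bits 31:30 are reserved and are left at 0 by this helper.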
this register controls the inter packet gap (ipg) timer. ipgt specifies the ipg length for back-to-back transmissions. ipgr1 contains the length of the first part of the ipg time for non back-to-back transmissions. during this time, the ipg counter restarts if any carrier sense event occurs. once the time specified by ipgr1 has elapsed, carrier sense does not affect the ipg counter. ipgr2 specifies the total ipg time for non back-to-back transmissions. according to the ieee 802.3 spec, ipgr1 should be 2/3 of ipgr2. ipgr1 and ipgr2 are significant only for half-duplex operation.

note: the actual time waited for ipgt and ipgr2 is 6 mac clocks (48 ns @ 1 gb/s) longer than the value programmed in the register. this is due to the implementation of the asynchronous interface between the internal dma and mac engines. therefore, the suggested value that software should program into this register is 0x00602006. this corresponds to: ipgt = 6 (6+6 = total delay of 12); ipgr1 = 8; and ipgr2 = 6 (6+6 = total delay of 12). also, it should be noted that this six mac clock delay is longer than in previous implementations. for previous implementations, the actual time waited for any of the ipg timers was two mac clocks (16 ns) longer than the value programmed in the register. thus, for previous implementations, the suggested value for software to program this register was 0x00a0200a.

10.2.6.3 adaptive ifs throttle - ait (0x00458; rw)

adaptive ifs throttles back-to-back transmissions in the transmit packet buffer and delays their transfer to the csma/cd transmit function, and thus can be used to delay the transmission of back-to-back packets on the wire. normally, this register should be set to zero. however, if additional delay is desired between back-to-back transmits, then this register can be set with a value greater than zero.
the adaptive ifs field provides a similar function to the ipgt field in the tipg register (see section 10.2.6.2). however, it only affects the initial transmission timing, not re-transmission timing.

note: if the value of the adaptive ifs field is less than the ipg transmit time field in the transmit ipg registers then it has no effect, as the chip selects the maximum of the two values.

tipg fields (continued):
reserved | 31:30 | 0x0 | reserved. reads as 0b. should be written to 0b for future compatibility.

ait fields:
field | bit(s) | initial value | description
aifs | 15:0 | 0x0000 | adaptive ifs value. this value is in units of 8 ns.
reserved | 31:16 | 0x0000 | this field should be written with 0x0.

10.2.6.4 transmit descriptor base address low - tdbal (0x03800 + n*0x100[n=0..1]; rw)

field | bit(s) | initial value | description
0 | 3:0 | 0x0 | ignored on writes. returns 0x0 on reads.
tdbal | 31:4 | x | transmit descriptor base address low
this register contains the lower bits of the 64-bit descriptor base address. the lower four bits are ignored. the transmit descriptor base address must point to a 16-byte aligned block of data.

note: this register's address has been moved from where it was located in previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x00420.

10.2.6.5 transmit descriptor base address high - tdbah (0x03804 + n*0x100[n=0..1]; rw)

this register contains the upper 32 bits of the 64-bit descriptor base address.

note: this register's address has been moved from where it was located in previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x00424.

field | bit(s) | initial value | description
tdbah | 31:0 | x | transmit descriptor base address [63:32]

10.2.6.6 transmit descriptor length - tdlen (0x03808 + n*0x100[n=0..1]; rw)

this register contains the descriptor length and must be 128-byte aligned.

note: this register's address has been moved from where it was located in previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x00428.

field | bit(s) | initial value | description
0 | 6:0 | 0x0 | ignore on write. reads back as 0x0.
len | 19:7 | 0x0 | descriptor length
reserved | 31:20 | 0x0 | reads as 0x0. should be written to 0x0.

10.2.6.7 transmit descriptor head - tdh (0x03810 + n*0x100[n=0..1]; rw)

this register contains the head pointer for the transmit descriptor ring. it points to a 16-byte datum. hardware controls this pointer. the only time that software should write to this register is after a reset (hardware reset or ctrl.rst) and before enabling the transmit function (tctl.en).

field | bit(s) | initial value | description
tdh | 15:0 | 0x0 | transmit descriptor head
reserved | 31:16 | 0x0 | reserved. should be written with 0x0.
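the alignment rules for the ring registers can be checked before anything is written to the device. the following is a sketch with hypothetical names: it splits a 64-bit ring base address into tdbal/tdbah and derives tdlen in bytes from a descriptor count, rejecting a base that is not 16-byte aligned or a length that is not a multiple of 128 bytes. treating a zero-length ring as an error, and interpreting len as a byte count whose low seven bits are zero, are assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define TX_DESC_SIZE 16u       /* bytes per transmit descriptor */
#define TDLEN_MAX    0xfff80u  /* largest value that fits in len, bits 19:7 */

/* hypothetical container for the three register values */
struct tx_ring_regs {
    uint32_t tdbal;
    uint32_t tdbah;
    uint32_t tdlen;
};

/* returns 0 on success, -1 if an alignment or size rule is violated */
static int tx_ring_regs_compute(uint64_t base, uint32_t ndesc,
                                struct tx_ring_regs *out)
{
    uint64_t bytes = (uint64_t)ndesc * TX_DESC_SIZE;

    assert(out != 0);
    if ((base & 0xfu) != 0u)
        return -1;                      /* base must be 16-byte aligned */
    if (bytes == 0u || (bytes & 0x7fu) != 0u || bytes > TDLEN_MAX)
        return -1;                      /* length must be 128-byte aligned */

    out->tdbal = (uint32_t)(base & 0xffffffffu);
    out->tdbah = (uint32_t)(base >> 32);
    out->tdlen = (uint32_t)bytes;
    return 0;
}
```

since each descriptor is 16 bytes, the 128-byte rule means the descriptor count has to be a multiple of eight.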
note: if software were to write to this register while the transmit function was enabled, the on-chip descriptor buffers might be invalidated and the hardware could become unstable.

note: this register's address has been moved from where it was located in previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x00430.

10.2.6.8 transmit descriptor tail - tdt (0x03818 + n*0x100[n=0..1]; rw)

this register contains the tail pointer for the transmit descriptor ring. it points to a 16-byte datum. software writes the tail pointer to add more descriptors to the transmit ready queue. hardware attempts to transmit all packets referenced by descriptors between head and tail.

note: this register's address has been moved from where it was located in previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x00438.

field | bit(s) | initial value | description
tdt | 15:0 | 0x0 | transmit descriptor tail
reserved | 31:16 | 0x0 | reads as 0. should be written to 0 for future compatibility.

10.2.6.9 transmit arbitration count - tarc (0x03840 + n*0x100[n=0..1]; rw)

count is the transmit arbitration counter value. the counter is subtracted as a part of the transmit arbitration.

field | bit(s) | initial value | description
count | 6:0 | 0x3 | transmit arbitration count. the number of packets that can be sent from the queue to make the n over m arbitration between the queues. writing 0x0 to this register is not allowed.
comp | 7 | 0b | compensation mode. when set to 1b, hardware compensates this queue according to the compensation ratio if the number of packets in a tcp segmentation in the opposite queue caused the counter in that queue to go below zero.
ratio | 9:8 | 00b | compensation ratio. this value determines the ratio between the number of packets transmitted on the opposite queue in a tcp segmentation offload to the number of the packets that are added to this queue as compensation. 00b = 1/1 compensation. 01b = 1/2 compensation. 10b = 1/4 compensation. 11b = 1/8 compensation.
enable | 10 | 1b | descriptor enable. the enable bit of transmit queue 0 should always be set.
reserved | 26:11 | 0x0 | reserved. reads as 0. should be written to 0 for future compatibility.
reserved | 30:27 | 0000b | reserved
reserved | 31 | 0b | reads as 0b. should be written to 0b for future compatibility.
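the ratio encoding (1/1, 1/2, 1/4, 1/8) is equivalent to a right shift of the opposite queue's deficit by the ratio field value. the sketch below is only a software model of that arithmetic for reasoning about the counters, not a description of the hardware implementation; the function name is hypothetical and truncation toward zero is assumed, which matches the 5/2 = 2 example given in the text of this section.

```c
#include <assert.h>

/* model of the compensation added to a queue's count.
 * opposite_count: counter value of the other queue after a tcp
 *                 segmentation burst (negative when it overshot)
 * ratio:          tarc.ratio field value, 0..3 (1/1, 1/2, 1/4, 1/8) */
static int tarc_compensation(int opposite_count, unsigned ratio)
{
    assert(ratio <= 3u);
    if (opposite_count >= 0)
        return 0;                       /* no deficit, nothing to add */
    return (-opposite_count) >> ratio;  /* deficit scaled by the ratio */
}
```

with tarc0.count at -5 and tarc1.ratio at 01b, tarc_compensation(-5, 1) gives 2, the amount added to tarc1.count.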
it is reloaded to its high (last written) value when it decreases below zero.
- upon a read, hardware returns the current counter value.
- upon a write, the counter updates the high value in the next counter reload.
- the counter can be decreased in chunks (when transmitting tcp segmentation packets). it should never roll because of that. the size of chunks is determined according to the tcp segmentation (number of packets sent).

when the counter reaches zero, other tx queues should be selected for transmission as soon as possible (usually after current transmission).

comp is the enable bit for compensation between the two queues. when it is enabled (set to 1b) and one of the queues is transmitting tcp segmentation packets and its counter goes below zero, hardware compensates the other queue according to the ratio in the opposite tarc.ratio register. for example, if tarc0.count reaches -5 after sending tcp segmentation packets, both tarc0.comp and tarc1.comp are enabled (set to 1b), and tarc1.ratio is 01b (1/2 compensation), then tarc1.count is adjusted by adding 5/2 = 2 to the current count.

ratio is the multiplier to compensate between the two queues. the compensation method is described in the previous explanation.

10.2.6.10 transmit interrupt delay value - tidv (0x03820; rw)

this register is used to delay interrupt notification for transmit operations by coalescing interrupts for multiple transmitted buffers. delaying interrupt notification helps maximize the amount of transmit buffers reclaimed by a single interrupt. this feature only applies to transmit descriptor operations where:
1. interrupt-based reporting is requested (rs set).
2. the use of the timer function is requested (ide is set).

this feature operates by initiating a count-down timer upon successfully transmitting the buffer.
if a subsequent transmit delayed-interrupt is scheduled before the timer expires, the timer is re-initialized to the programmed value and re-starts its count down. when the timer expires, a transmit-complete interrupt (icr.txdw) is generated.

setting the value to 0 is not allowed. if an immediate (non-scheduled) interrupt is desired for any transmit descriptor, the descriptor ide should be set to 0b.

the occurrence of either an immediate (non-scheduled) or absolute transmit timer interrupt halts the tidv timer and eliminates any spurious second interrupts.

field | bit(s) | initial value | description
idv | 15:0 | 0x0 | interrupt delay value. counts in units of 1.024 microseconds. a value of 0 is not allowed.
reserved | 30:16 | 0x0 | reads as 0x0. should be written to 0x0 for future compatibility.
fpd | 31 | 0b | flush partial descriptor block. when set to 1b, ignored. reads as 0b.
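since idv counts in units of 1.024 microseconds (1024 ns) and a value of 0 is not allowed, a driver typically converts a requested delay into ticks and clamps the result to the 16-bit field. a minimal sketch, with a hypothetical function name and round-to-nearest chosen arbitrarily:

```c
#include <assert.h>
#include <stdint.h>

/* convert a delay in nanoseconds to an idv field value.
 * one idv unit is 1.024 us = 1024 ns; the result is clamped to
 * 1..0xffff because idv is 16 bits wide and 0 is not allowed. */
static uint16_t tidv_from_ns(uint64_t delay_ns)
{
    uint64_t ticks = delay_ns / 1024u;

    if (delay_ns % 1024u >= 512u)
        ticks++;                        /* round to the nearest unit */
    if (ticks < 1u)
        ticks = 1u;
    if (ticks > 0xffffu)
        ticks = 0xffffu;
    assert(ticks >= 1u && ticks <= 0xffffu);
    return (uint16_t)ticks;
}
```

for example, a requested delay of 32768 ns maps to an idv of 32.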
a transmit interrupt due to a transmit absolute timer (tadv) expiration or an immediate interrupt (rs = 1b, ide = 0b) cancels a pending tidv interrupt. the tidv countdown timer is re-loaded but halted, though it can be re-started by processing a subsequent transmit descriptor.

note: this register's address has been moved from where it was located in previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x00440.

writing this register with fpd set initiates an immediate expiration of the timer, causing a write back of any consumed transmit descriptors pending write back, and results in a transmit timer interrupt in the icr.

note: fpd is self clearing.

10.2.6.11 transmit descriptor control - txdctl (0x03828 + n*0x100[n=0..1]; rw)

this register controls the fetching and write back of transmit descriptors. the three threshold values are used to determine when descriptors are read from and written to host memory. the values can be in units of cache lines or descriptors (each descriptor is 16 bytes) based on the gran flag.

note: when gran = 1b all descriptors are written back (even if not requested).

pthresh is used to control when a prefetch of descriptors is considered. this threshold refers to the number of valid, unprocessed transmit descriptors the chip has in its on-chip buffer. if this number drops below pthresh, the algorithm considers pre-fetching descriptors from host memory. however, this fetch does not happen unless there are at least hthresh valid descriptors in host memory to fetch.

note: hthresh should be given a non-zero value whenever pthresh is used.

field | bit(s) | initial value | description
pthresh | 5:0 | 0x0 | prefetch threshold
rsv | 7:6 | 0x0 | reserved
hthresh | 13:8 | 0x0 | host threshold
rsv | 15:14 | 0x0 | reserved
wthresh | 21:16 | 0x0 | write-back threshold
rsv | 23:22 | 0x0 | reserved
gran | 24 | 0b | granularity. units for the thresholds in this register.
0b = cache lines. 1b = descriptors.
lwthresh | 31:25 | 0x0 | transmit descriptor low threshold. interrupt asserted when the number of descriptors pending service in the transmit descriptor queue (processing distance from the tdt) drops below this threshold.
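the txdctl field positions listed in the table can be combined into one register value as sketched below. the helper name is hypothetical, and the argument values in the usage note are arbitrary illustrations rather than recommended settings.

```c
#include <assert.h>
#include <stdint.h>

/* assemble a txdctl value: pthresh in bits 5:0, hthresh in bits 13:8,
 * wthresh in bits 21:16, gran in bit 24, lwthresh in bits 31:25.
 * the reserved fields (7:6, 15:14, 23:22) are left at 0. */
static uint32_t txdctl_pack(uint32_t pthresh, uint32_t hthresh,
                            uint32_t wthresh, uint32_t gran,
                            uint32_t lwthresh)
{
    assert(pthresh < 64u && hthresh < 64u && wthresh < 64u);
    assert(gran < 2u && lwthresh < 128u);
    return pthresh | (hthresh << 8) | (wthresh << 16) |
           (gran << 24) | (lwthresh << 25);
}
```

for instance, txdctl_pack(31, 1, 1, 1, 1) selects descriptor granularity with pthresh = 31, hthresh = 1, wthresh = 1 and a low threshold of one unit, and evaluates to 0x0301011f.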
wthresh controls the write-back of processed transmit descriptors. this threshold refers to the number of transmit descriptors in the on-chip buffer that are ready to be written back to host memory. in the absence of external events (explicit flushes), the write back occurs only after at least wthresh descriptors are available for write back.

possible values:
- gran = 1b (descriptor granularity):
  - pthresh = 0..47
  - wthresh = 0..63
  - hthresh = 0..63
- gran = 0b (cacheline granularity):
  - pthresh = 0..3 (for 16 descriptors cacheline - 256 bytes)
  - wthresh = 0..3
  - hthresh = 0..4

note: for any wthresh value other than zero, the packet and absolute timers must be given a non-zero value for the wthresh feature to take effect.

note: since the default value for write-back threshold is zero, descriptors are normally written back as soon as they are processed. wthresh must be a non-zero value to take advantage of the write-back bursting capabilities of the 82574.

since write-back of transmit descriptors is optional (under the control of the rs bit in the descriptor), not all processed descriptors are counted with respect to wthresh. descriptors start accumulating after a descriptor with rs set. furthermore, with transmit descriptor bursting enabled, some descriptors are written back that did not have rs set in their respective descriptors.

note: leaving this value at its default causes descriptor processing to be similar to previous devices.

as descriptors are transmitted, the number of descriptors waiting in the transmit descriptor queue decreases, as noted by the transmit descriptor head and tail positions in the circular queue. when the number of waiting descriptors drops to lwthresh (the head and tail positions are sufficiently close to one another) an interrupt is asserted.

lwthresh controls the number of descriptors in the transmit ring at which a transmit descriptor-low interrupt (icr.txd_low) is reported.
this might enable software to operate more efficiently by maintaining a continuous addition of transmit work, interrupting only when the hardware nears completion of all submitted work. lwthresh specifies a multiple of eight descriptors. an interrupt is asserted when the number of descriptors available transitions from (threshold level = 8*lwthresh) + 1 to (threshold level = 8*lwthresh). setting this value to zero disables this feature.

10.2.6.12 transmit absolute interrupt delay value - tadv (0x0382c; rw)

field | bit(s) | initial value | description
idv | 15:0 | 0x0 | interrupt delay value. counts in units of 1.024 µs. (0 = disabled).
reserved | 31:16 | 0x0 | reads as 0x0. should be written to 0x0 for future compatibility.
the transmit interrupt delay timer (tidv) can be used to coalesce transmit interrupts. however, it might be necessary to ensure that no completed transmit remains unnoticed for too long an interval in order to ensure timely release of transmit buffers. this register can be used to ensure that a transmit interrupt occurs at some pre-defined interval after a transmit completes. like the delayed-transmit timer, the absolute transmit timer only applies to transmit descriptor operations where:
1. interrupt-based reporting is requested (rs set).
2. the use of the timer function is requested (ide is set).

this feature operates by initiating a count-down timer upon successfully transmitting the buffer. when the timer expires, a transmit-complete interrupt (icr.txdw) is generated. the occurrence of either an immediate (non-scheduled) or delayed transmit timer (tidv) expiration interrupt halts the tadv timer and eliminates any spurious second interrupts.

setting the value to zero disables the transmit absolute delay function. if an immediate (non-scheduled) interrupt is desired for any transmit descriptor, the descriptor ide should be set to 0b.

10.2.7 statistic register descriptions

note: all statistics registers reset when read. in addition, they stick at 0xffff_ffff when the maximum value is reached.

note: for the receive statistics it should be noted that a packet is indicated as received if it passes the device's filters and is placed into the packet buffer memory. a packet does not have to be dma'd to host memory in order to be counted as received.

note: due to divergent paths between interrupt-generation and logging of relevant statistics counts, it might be possible to generate an interrupt to the system for a noteworthy event prior to the associated statistics count actually being incremented.
this is extremely unlikely due to expected delays associated with the system interrupt-collection and isr delay, but might be observed as an interrupt for which statistics values do not quite make sense. hardware guarantees that any event noteworthy of inclusion in a statistics count is reflected in the appropriate count within 1 µs; a small time-delay prior to read of statistics might be necessary to avoid the potential for receiving an interrupt and observing an inconsistent statistics count as part of the isr.

10.2.7.1 crc error count - crcerrs (0x04000; r)

counts the number of receive packets with crc errors. in order for a packet to be counted in this register, it must pass address filtering and must be 64 bytes or greater (from <destination address> through <crc>, inclusively) in length. if receives are not enabled, then this register does not increment.

field | bit(s) | initial value | description
cec | 31:0 | 0x0 | crc error count

10.2.7.2 alignment error count - algnerrc (0x04004; r)

field | bit(s) | initial value | description
aec | 31:0 | 0x0 | alignment error count
counts the number of receive packets with alignment errors (such as the packet is not an integer number of bytes in length). in order for a packet to be counted in this register, it must pass address filtering and must be 64 bytes or greater (from <destination address> through <crc>, inclusively) in length. if receives are not enabled, then this register does not increment. this register is valid only in mii mode during 10/100 mb/s operation.

10.2.7.3 rx error count - rxerrc (0x0400c; r)

counts the number of packets received in which rx_er was asserted by the phy. in order for a packet to be counted in this register, it must pass address filtering and must be 64 bytes or greater (from <destination address> through <crc>, inclusively) in length. if receives are not enabled, then this register does not increment.

field | bit(s) | initial value | description
rxec | 31:0 | 0x0 | rx error count

10.2.7.4 missed packets count - mpc (0x04010; r)

counts the number of missed packets. packets are missed when the receive fifo has insufficient space to store the incoming packet. this could be caused by too few buffers being allocated, or by insufficient bandwidth on the io bus. events setting this counter cause rxo, the receiver overrun interrupt, to be set. this register does not increment if receives are not enabled.

note: these packets are also counted in the total packets received register as well as in the total octets received register.

field | bit(s) | initial value | description
mpc | 31:0 | 0x0 | missed packets count

10.2.7.5 single collision count - scc (0x04014; r)

this register counts the number of times that a successfully transmitted packet encountered a single collision. this register only increments if transmits are enabled and the device is in half-duplex mode.

field | bit(s) | initial value | description
scc | 31:0 | 0x0 | number of times a transmit encountered a single collision.

10.2.7.6 excessive collisions count - ecol (0x04018; r)
field | bit(s) | initial value | description
ecc | 31:0 | 0x0 | number of packets with more than 16 collisions.
when 16 or more collisions have occurred on a packet, this register increments, regardless of the value of collision threshold. if collision threshold is set below 16, this counter won't increment. this register only increments if transmits are enabled and the device is in half-duplex mode.

10.2.7.7 multiple collision count - mcc (0x0401c; r)

this register counts the number of times that a transmit encountered more than one collision but less than 16. this register only increments if transmits are enabled and the device is in half-duplex mode.

field | bit(s) | initial value | description
mcc | 31:0 | 0x0 | number of times a successful transmit encountered multiple collisions.

10.2.7.8 late collisions count - latecol (0x04020; r)

late collisions are collisions that occur after one slot time. this register only increments if transmits are enabled and the device is in half-duplex mode.

field | bit(s) | initial value | description
lcc | 31:0 | 0x0 | number of packets with late collisions.

10.2.7.9 collision count - colc (0x04028; r)

this register counts the total number of collisions seen by the transmitter. this register only increments if transmits are enabled and the device is in half-duplex mode. this register applies to clear as well as secure traffic.

field | bit(s) | initial value | description
colc | 31:0 | 0x0 | total number of collisions experienced by the transmitter.

10.2.7.10 defer count - dc (0x04030; r)

this register counts defer events. a defer event occurs when the transmitter cannot immediately send a packet due to the medium being busy for one of the following reasons:
- another device is transmitting
- the ipg timer has not expired
- half-duplex deferral events
- reception of xoff frames
- the link is not up

field | bit(s) | initial value | description
cdc | 31:0 | 0x0 | number of defer events.
this register only increments if transmits are enabled. the behavior of this counter is slightly different in the 82574 relative to previous devices. for the 82574, this counter does not increment for streaming transmits that are deferred due to tx ipg.

10.2.7.11 transmit with no crs - tncrs (0x04034; r)

this register counts the number of successful packet transmissions in which the crs input from the phy was not asserted within one slot time of start of transmission from the mac. start of transmission is defined as the assertion of tx_en to the phy.

the phy should assert crs during every transmission. failure to do so might indicate that the link has failed, or the phy has an incorrect link configuration. this register only increments if transmits are enabled. this register is only valid when the 82574 is operating at half duplex.

10.2.7.12 carrier extension error count - cexterr (0x0403c; r)

this register counts the number of packets received in which the carrier extension error was signaled across the gmii interface. the phy propagates carrier extension errors to the mac when an error is detected during the carrier extended time of a packet reception. an extension error is signaled by the phy by the encoding of 0x1f on the receive data inputs while rx_er is asserted to the mac. this register only increments if receives are enabled and the device is operating at 1000 mb/s.

10.2.7.13 receive length error count - rlec (0x04040; r)

this register counts receive length error events. a length error occurs if an incoming packet passes the filter criteria but is undersized or oversized. packets less than 64 bytes are undersized. packets over 1522 bytes are oversized if longpacketenable is 0b. if longpacketenable (lpe) is 1b, then an incoming packet is considered oversized if it exceeds 16384 bytes. if receives are not enabled, this register does not increment.
these lengths are based on bytes in the received packet from destination address through crc, inclusively.

field    bit(s)  initial value  description
tncrs    31:0    0x0            number of transmissions without a crs assertion from the phy.

field    bit(s)  initial value  description
cexterr  31:0    0x0            number of packets received with a carrier extension error.

field    bit(s)  initial value  description
rlec     31:0    0x0            number of packets with receive length errors.
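The length rule behind rlec can be restated as a small predicate. The following is a minimal sketch in C, not driver code: the macro and function names are hypothetical, and only the three limits (64, 1522, and 16384 bytes) come from the description above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Limits quoted in the rlec description; all lengths are in bytes. */
#define FRAME_MIN_LEN      64u     /* anything shorter is undersized    */
#define FRAME_MAX_LEN      1522u   /* oversize threshold when lpe is 0b */
#define FRAME_MAX_LEN_LPE  16384u  /* oversize threshold when lpe is 1b */

/*
 * Hypothetical helper: returns true when a frame of `len` bytes would be
 * treated as a length error under the rule above. It does not model
 * address filtering or the receive-enable condition.
 */
static bool is_length_error(size_t len, bool lpe)
{
    size_t max_len = lpe ? FRAME_MAX_LEN_LPE : FRAME_MAX_LEN;

    return len < FRAME_MIN_LEN || len > max_len;
}
```

For example, `is_length_error(1523, false)` is true, while the same length with lpe set is accepted.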
10.2.7.14 xon received count - xonrxc (0x04048; r)

this register counts the number of xon packets received. xon packets can use the global address, or the station address. this register only increments if receives are enabled.

field    bit(s)  initial value  description
xonrxc   31:0    0x0            number of xon packets received.

10.2.7.15 xon transmitted count - xontxc (0x0404c; r)

this register counts the number of xon packets transmitted. these can be either due to queue fullness, or due to software initiated action (using swxoff). this register only increments if transmits are enabled.

field    bit(s)  initial value  description
xontxc   31:0    0x0            number of xon packets transmitted.

10.2.7.16 xoff received count - xoffrxc (0x04050; r)

this register counts the number of xoff packets received. xoff packets can use the global address, or the station address. this register only increments if receives are enabled.

field    bit(s)  initial value  description
xoffrxc  31:0    0x0            number of xoff packets received.

10.2.7.17 xoff transmitted count - xofftxc (0x04054; r)

this register counts the number of xoff packets transmitted. these can be either due to queue fullness, or due to software initiated action (using swxoff). this register only increments if transmits are enabled.

field    bit(s)  initial value  description
xofftxc  31:0    0x0            number of xoff packets transmitted.

10.2.7.18 fc received unsupported count - fcruc (0x04058; rw)

this register counts the number of unsupported flow control frames that are received.

field    bit(s)  initial value  description
fcruc    31:0    0x0            number of unsupported flow control frames received.
the fcruc counter is incremented when a flow control packet is received that matches either the reserved flow control multicast address (in fcah/l) or the mac station address, and has a matching flow control type field (equal to the value in fct), but has an incorrect op-code field. this register only increments if receives are enabled.

10.2.7.19 packets received [64 bytes] count - prc64 (0x0405c; rw)

this register counts the number of good packets received that are exactly 64 bytes (from destination address through crc, inclusively) in length. packets that are counted in the missed packet count register are not counted in this register. this register does not include received flow control packets and increments only if receives are enabled.

field   bit(s)  initial value  description
prc64   31:0    0              number of packets received that are 64 bytes in length.

10.2.7.20 packets received [65-127 bytes] count - prc127 (0x04060; rw)

this register counts the number of good packets received that are 65-127 bytes (from destination address through crc, inclusively) in length. packets that are counted in the missed packet count register are not counted in this register. this register does not include received flow control packets and increments only if receives are enabled.

field   bit(s)  initial value  description
prc127  31:0    0x0            number of packets received that are 65-127 bytes in length.

10.2.7.21 packets received [128-255 bytes] count - prc255 (0x04064; rw)

this register counts the number of good packets received that are 128-255 bytes (from destination address through crc, inclusively) in length. packets that are counted in the missed packet count register are not counted in this register. this register does not include received flow control packets and increments only if receives are enabled.

field   bit(s)  initial value  description
prc255  31:0    0x0            number of packets received that are 128-255 bytes in length.

10.2.7.22 packets received [256-511 bytes] count - prc511 (0x04068; rw)
field   bit(s)  initial value  description
prc511  31:0    0x0            number of packets received that are 256-511 bytes in length.
this register counts the number of good packets received that are 256-511 bytes (from destination address through crc, inclusively) in length. packets that are counted in the missed packet count register are not counted in this register. this register does not include received flow control packets and increments only if receives are enabled.

10.2.7.23 packets received [512-1023 bytes] count - prc1023 (0x0406c; rw)

this register counts the number of good packets received that are 512-1023 bytes (from destination address through crc, inclusively) in length. packets that are counted in the missed packet count register are not counted in this register. this register does not include received flow control packets and increments only if receives are enabled.

10.2.7.24 packets received [1024 to max bytes] count - prc1522 (0x04070; rw)

this register counts the number of good packets received that are from 1024 bytes to the maximum (from destination address through crc, inclusively) in length. the maximum is dependent on the current receiver configuration (such as lpe) and the type of packet being received. if a packet is counted in the receive oversized count register, it is not counted in this register (see section 10.2.7.36). this register does not include received flow control packets and only increments if the packet has passed address filtering and receives are enabled.

due to changes in the standard for maximum frame size for vlan tagged frames in 802.3, this device accepts packets which have a maximum length of 1522 bytes. the rmon statistics associated with this range have been extended to count 1522 byte long packets.

10.2.7.25 good packets received count - gprc (0x04074; r)

this register counts the number of good (non-erred) packets received of any legal length. the legal length for the received packet is defined by the value of lpe (see section 10.2.7.13). this register does not include received flow control packets and only counts packets that pass filtering.
this register only increments if receives are enabled. this register does not count packets counted by the missed packet count (mpc) register.

field    bit(s)  initial value  description
prc1023  31:0    0x0            number of packets received that are 512-1023 bytes in length.

field    bit(s)  initial value  description
prc1522  31:0    0x0            number of packets received that are 1024-maximum bytes in length.

field    bit(s)  initial value  description
gprc     31:0    0x0            number of good packets received (of any length).
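The prc64 through prc1522 registers together form a size histogram of good received packets. As a rough illustration of which counter a given length falls into, the C sketch below maps a frame length to a bucket. The enum and function names are hypothetical, and `max_len` is passed in by the caller because the upper bound depends on receiver configuration (for example lpe); flow control packets, missed packets, and filtering are not modeled.

```c
#include <stddef.h>

/* One bucket per receive size counter; BUCKET_NONE means "not counted here". */
enum rx_size_bucket {
    BUCKET_NONE,
    BUCKET_64,        /* prc64   */
    BUCKET_65_127,    /* prc127  */
    BUCKET_128_255,   /* prc255  */
    BUCKET_256_511,   /* prc511  */
    BUCKET_512_1023,  /* prc1023 */
    BUCKET_1024_MAX   /* prc1522 */
};

/*
 * Hypothetical helper: pick the size bucket for a good frame of `len` bytes.
 * Frames shorter than 64 bytes or longer than `max_len` belong to the
 * undersize/oversize counters instead, so they map to BUCKET_NONE.
 */
static enum rx_size_bucket rx_bucket_for_len(size_t len, size_t max_len)
{
    if (len < 64 || len > max_len)
        return BUCKET_NONE;
    if (len == 64)
        return BUCKET_64;
    if (len <= 127)
        return BUCKET_65_127;
    if (len <= 255)
        return BUCKET_128_255;
    if (len <= 511)
        return BUCKET_256_511;
    if (len <= 1023)
        return BUCKET_512_1023;
    return BUCKET_1024_MAX;
}
```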
10.2.7.26 broadcast packets received count - bprc (0x04078; r)

this register counts the number of good (non-erred) broadcast packets received. this register does not count broadcast packets received when the broadcast address filter is disabled. this register only increments if receives are enabled.

field  bit(s)  initial value  description
bprc   31:0    0x0            number of broadcast packets received.

10.2.7.27 multicast packets received count - mprc (0x0407c; r)

this register counts the number of good (non-erred) multicast packets received. this register does not count multicast packets received that fail to pass address filtering nor does it count received flow control packets. this register only increments if receives are enabled. this register does not count packets counted by the missed packet count (mpc) register.

field  bit(s)  initial value  description
mprc   31:0    0x0            number of multicast packets received.

10.2.7.28 good packets transmitted count - gptc (0x04080; r)

this register counts the number of good (non-erred) packets transmitted. a good transmit packet is considered one that is 64 or more bytes in length (from destination address through crc, inclusively). this does not include transmitted flow control packets. this register only increments if transmits are enabled. this register does not count packets counted by the missed packet count (mpc) register. the register counts clear as well as secure packets.

10.2.7.29 good octets received count - gorcl (0x04088; r)
10.2.7.30 good octets received count - gorch (0x0408c; r)

these registers make up a logical 64-bit register that counts the number of good (non-erred) octets received. this register includes bytes received in a packet from the destination address field through the crc field, inclusively. this register must be accessed using two independent 32-bit accesses. this register resets whenever the upper 32 bits are read (gorch).
field  bit(s)  initial value  description
gptc   31:0    0x0            number of good packets transmitted.

field  bit(s)  initial value  description
gorcl  31:0    0x0            number of good octets received - lower 4 bytes.
gorch  31:0    0x0            number of good octets received - upper 4 bytes.
in addition, it sticks at 0xffff_ffff_ffff_ffff when the maximum value is reached. only packets that pass address filtering are counted in this register. this register only increments if receives are enabled. these octets do not include octets in received flow control packets.

10.2.7.31 good octets transmitted count - gotcl (0x04090; r)
10.2.7.32 good octets transmitted count - gotch (0x04094; r)

these registers make up a logical 64-bit register that counts the number of good (non-erred) octets transmitted. this register must be accessed using two independent 32-bit accesses. this register resets whenever the upper 32 bits are read (gotch). in addition, it sticks at 0xffff_ffff_ffff_ffff when the maximum value is reached. this register includes bytes transmitted in a packet from the destination address field through the crc field, inclusively. this register counts octets in successfully transmitted packets which are 64 or more bytes in length. this register only increments if transmits are enabled. the register counts clear as well as secure octets. these octets do not include octets in transmitted flow control packets.

field  bit(s)  initial value  description
gotcl  31:0    0x0            number of good octets transmitted - lower 4 bytes.
gotch  31:0    0x0            number of good octets transmitted - upper 4 bytes.

10.2.7.33 receive no buffers count - rnbc (0x040a0; r)

this register counts the number of times that frames were received when there were no available buffers in host memory to store those frames (receive descriptor head and tail pointers were equal). the packet is still received if there is space in the fifo. this register only increments if receives are enabled. this register does not increment when flow control packets are received.

field  bit(s)  initial value  description
rnbc   31:0    0x0            number of receive no buffer conditions.

10.2.7.34 receive undersize count - ruc (0x040a4; r)

this register counts the number of received frames that passed address filtering, and were less than minimum size (64 bytes from destination address through crc, inclusively), and had a valid crc. this register only increments if receives are enabled.

field  bit(s)  initial value  description
ruc    31:0    0x0            number of receive undersize errors.
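Because the 64-bit octet counters reset when their upper half is read, the order of the two 32-bit accesses matters. The C sketch below reads the lower half first and the upper half second, which is an inference from the reset rule rather than an ordering the text spells out. The helper names and the `bar` mapping pointer are hypothetical; only the register offsets come from the section headings.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets taken from the section headings above. */
#define REG_GORCL 0x04088u
#define REG_GORCH 0x0408Cu
#define REG_GOTCL 0x04090u
#define REG_GOTCH 0x04094u

/* Hypothetical 32-bit MMIO read; `bar` is the mapped register base. */
static uint32_t reg_read32(const volatile void *bar, uint32_t off)
{
    return *(const volatile uint32_t *)((const volatile uint8_t *)bar + off);
}

/*
 * Read one logical 64-bit statistic as two 32-bit accesses.
 * The low half is read first because reading the high half resets
 * the whole counter (assumed ordering, see lead-in).
 */
static uint64_t read_stat64(const volatile void *bar,
                            uint32_t lo_off, uint32_t hi_off)
{
    assert(bar != 0);

    uint64_t lo = reg_read32(bar, lo_off);
    uint64_t hi = reg_read32(bar, hi_off);

    return (hi << 32) | lo;
}

/*
 * Since each read clears the hardware counter, a driver would keep its
 * own running total. Saturating here mirrors the hardware's sticky
 * maximum; that is a design choice of this sketch, not a requirement.
 */
static void stat64_accumulate(uint64_t *total, uint64_t delta)
{
    *total = (UINT64_MAX - *total < delta) ? UINT64_MAX : *total + delta;
}
```

A caller might periodically run `stat64_accumulate(&rx_octets, read_stat64(bar, REG_GORCL, REG_GORCH))`, where `rx_octets` is a software-maintained total.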
10.2.7.35 receive fragment count - rfc (0x040a8; r)

this register counts the number of received frames that passed address filtering, and were less than minimum size (64 bytes from destination address through crc, inclusively), but had a bad crc (this is slightly different from the receive undersize count register). this register only increments if receives are enabled.

field   bit(s)  initial value  description
rfc     31:0    0x0            number of receive fragment errors.

10.2.7.36 receive oversize count - roc (0x040ac; r)

this register counts the number of received frames that passed address filtering, and were greater than maximum size. packets over 1522 bytes are oversized if lpe is 0b. if lpe is 1b, then an incoming packet is considered oversized if it exceeds 16384 bytes. if receives are not enabled, this register does not increment. these lengths are based on bytes in the received packet from destination address through crc, inclusively.

field   bit(s)  initial value  description
roc     31:0    0x0            number of receive oversize errors.

10.2.7.37 receive jabber count - rjc (0x040b0; r)

this register counts the number of received frames that passed address filtering, and were greater than maximum size and had a bad crc (this is slightly different from the receive oversize count register). packets over 1522 bytes are oversized if lpe is 0b. if lpe is 1b, then an incoming packet is considered oversized if it exceeds 16384 bytes. if receives are not enabled, this register does not increment. these lengths are based on bytes in the received packet from destination address through crc, inclusively.

field   bit(s)  initial value  description
rjc     31:0    0x0            number of receive jabber errors.

10.2.7.38 management packets received count - mngprc (0x040b4; r)

field   bit(s)  initial value  description
mngprc  31:0    0x0            number of management packets received.
this register counts the total number of packets received that pass the management filters, regardless of l3/l4 checksum errors. flow control packets as well as packets with l2 errors are not counted. packets dropped because the management receive fifo was full are counted.

10.2.7.39 management packets dropped count - mpdc (0x040b8; r)

this register counts the total number of packets received that pass the management filters as described in section 3.5 and then are dropped because the management receive fifo is full or the packet is longer than 200 bytes. management packets include rmcp and arp packets.

field  bit(s)  initial value  description
mpdc   31:0    0x0            number of management packets dropped.

10.2.7.40 management packets transmitted count - mptc (0x040bc; r)

this register counts the total number of packets that are transmitted that are either received over the smbus or are generated by the 82574's asf function.

field  bit(s)  initial value  description
mptc   31:0    0x0            number of management packets transmitted.

10.2.7.41 total octets received - torl (0x040c0; r)
10.2.7.42 total octets received - torh (0x040c4; r)

these registers make up a logical 64-bit register that counts the total number of octets received. this register must be accessed using two independent 32-bit accesses. this register resets whenever the upper 32 bits are read (torh). in addition, it sticks at 0xffff_ffff_ffff_ffff when the maximum value is reached. all packets received have their octets summed into this register, regardless of their length, whether they are erred, or whether they are flow control packets. this register includes bytes received in a packet from the destination address field through the crc field, inclusively. this register only increments if receives are enabled.

note: broadcast rejected packets are counted in this counter (in contradiction to all other rejected packets that are not counted).
field  bit(s)  initial value  description
torl   31:0    0x0            number of total octets received - lower 4 bytes.
torh   31:0    0x0            number of total octets received - upper 4 bytes.
10.2.7.43 total octets transmitted - tot (0x040c8; rw)

these registers make up a logical 64-bit register that counts the total number of octets transmitted. this register must be accessed using two independent 32-bit accesses. this register resets whenever the upper 32 bits are read (toth). in addition, it sticks at 0xffff_ffff_ffff_ffff when the maximum value is reached. all transmitted packets have their octets summed into this register, regardless of their length or whether they are flow control packets. this register includes bytes transmitted in a packet from the destination address field through the crc field, inclusively. octets transmitted as part of partial packet transmissions (for example, collisions in half-duplex mode) are not included in this register. this register only increments if transmits are enabled.

field  bit(s)  initial value  description
totl   31:0    0x0            number of total octets transmitted - lower 4 bytes.
toth   31:0    0x0            number of total octets transmitted - upper 4 bytes.

10.2.7.44 total packets received - tpr (0x040d0; rw)

this register counts the total number of all packets received. all packets received are counted in this register, regardless of their length, whether they are erred, or whether they are flow control packets. this register only increments if receives are enabled.

note: broadcast rejected packets are counted in this counter (in contradiction to all other rejected packets that are not counted).

field  bit(s)  initial value  description
tpr    31:0    0x0            number of all packets received.

10.2.7.45 total packets transmitted - tpt (0x040d4; rw)

this register counts the total number of all packets transmitted. all packets transmitted are counted in this register, regardless of their length, or whether they are flow control packets. partial packet transmissions (for example, collisions in half-duplex mode) are not included in this register. this register only increments if transmits are enabled. this register counts all packets, including standard packets, secure packets, packets received over the smbus and packets generated by the asf function.

field  bit(s)  initial value  description
tpt    31:0    0x0            number of all packets transmitted.
10.2.7.46 packets transmitted [64 bytes] count - ptc64 (0x040d8; rw)

this register counts the number of packets transmitted that are exactly 64 bytes (from destination address through crc, inclusively) in length. partial packet transmissions (for example, collisions in half-duplex mode) are not included in this register. this register does not include transmitted flow control packets (which are 64 bytes in length). this register only increments if transmits are enabled. this register counts all packets, including standard packets, secure packets, packets received over the smbus and packets generated by the asf function.

field   bit(s)  initial value  description
ptc64   31:0    0x0            number of packets transmitted that are 64 bytes in length.

10.2.7.47 packets transmitted [65-127 bytes] count - ptc127 (0x040dc; rw)

this register counts the number of packets transmitted that are 65-127 bytes (from destination address through crc, inclusively) in length. partial packet transmissions (for example, collisions in half-duplex mode) are not included in this register. this register only increments if transmits are enabled. this register counts all packets, including standard packets, secure packets, packets received over the smbus and packets generated by the asf function.

field   bit(s)  initial value  description
ptc127  31:0    0x0            number of packets transmitted that are 65-127 bytes in length.

10.2.7.48 packets transmitted [128-255 bytes] count - ptc255 (0x040e0; rw)

this register counts the number of packets transmitted that are 128-255 bytes (from destination address through crc, inclusively) in length. partial packet transmissions (for example, collisions in half-duplex mode) are not included in this register. this register only increments if transmits are enabled. this register counts all packets, including standard packets, secure packets, packets received over the smbus and packets generated by the asf function.
field   bit(s)  initial value  description
ptc255  31:0    0x0            number of packets transmitted that are 128-255 bytes in length.
10.2.7.49 packets transmitted [256-511 bytes] count - ptc511 (0x040e4; rw)

this register counts the number of packets transmitted that are 256-511 bytes (from destination address through crc, inclusively) in length. partial packet transmissions (for example, collisions in half-duplex mode) are not included in this register. this register only increments if transmits are enabled. this register counts all packets, including standard and secure packets. management packets are never more than 200 bytes.

field   bit(s)  initial value  description
ptc511  31:0    0x0            number of packets transmitted that are 256-511 bytes in length.

10.2.7.50 packets transmitted [512-1023 bytes] count - ptc1023 (0x040e8; rw)

this register counts the number of packets transmitted that are 512-1023 bytes (from destination address through crc, inclusively) in length. partial packet transmissions (for example, collisions in half-duplex mode) are not included in this register. this register only increments if transmits are enabled. this register counts all packets, including standard and secure packets. management packets are never more than 200 bytes.

10.2.7.51 packets transmitted [greater than 1024 bytes] count - ptc1522 (0x040ec; rw)

this register counts the number of packets transmitted that are 1024 or more bytes (from destination address through crc, inclusively) in length. partial packet transmissions (for example, collisions in half-duplex mode) are not included in this register. this register only increments if transmits are enabled.

due to changes in the standard for maximum frame size for vlan tagged frames in 802.3, this device transmits packets that have a maximum length of 1522 bytes. the rmon statistics associated with this range have been extended to count 1522 byte long packets. this register counts all packets, including standard and secure packets. management packets are never more than 200 bytes.

10.2.7.52 multicast packets transmitted count - mptc (0x040f0; rw)
field    bit(s)  initial value  description
ptc1023  31:0    0x0            number of packets transmitted that are 512-1023 bytes in length.

field    bit(s)  initial value  description
ptc1522  31:0    0x0            number of packets transmitted that are 1024 or more bytes in length.

field    bit(s)  initial value  description
mptc     31:0    0x0            number of multicast packets transmitted.
this register counts the number of multicast packets transmitted. this register does not include flow control packets and increments only if transmits are enabled. counts clear as well as secure traffic.

10.2.7.53 broadcast packets transmitted count - bptc (0x040f4; rw)

this register counts the number of broadcast packets transmitted. this register only increments if transmits are enabled. this register counts all packets, including standard and secure packets. management packets are never more than 200 bytes.

field  bit(s)  initial value  description
bptc   31:0    0x0            number of broadcast packets transmitted count.

10.2.7.54 tcp segmentation context transmitted count - tsctc (0x040f8; rw)

this register counts the number of tcp segmentation offload transmissions and increments once the last portion of the tcp segmentation context payload is segmented and loaded as a packet into the on-chip transmit buffer. note that it is not a measurement of the number of packets sent out (covered by other registers). this register only increments if transmits and tcp segmentation offload are enabled.

field  bit(s)  initial value  description
tsctc  31:0    0x0            number of tcp segmentation contexts transmitted count.

10.2.7.55 tcp segmentation context transmit fail count - tsctfc (0x040fc; rw)

this register counts the number of tcp segmentation offload requests to the hardware that failed to transmit all data in the tcp segmentation context payload. there is no indication by hardware of how much data was successfully transmitted. only one failure event is logged per tcp segmentation context. failures could be due to paylen errors. this register only increments if transmits are enabled.

10.2.7.56 interrupt assertion count - iac (0x04100; r)

this counter counts the total number of interrupts generated in the system.
field   bit(s)  initial value  description
tsctfc  31:0    0x0            number of tcp segmentation contexts where the device failed to transmit the entire data payload.

field   bit(s)  initial value  description
iac     31:0    0x0            this is a count of the legacy interrupt assertions that have occurred.
10.2.8 management register descriptions

10.2.8.1 wake up control register - wuc (0x05800; rw)

the pme_en and pme_status bits are reset when internal power on reset is 0b. when d3 cold is not supported, these bits are also reset by the de-assertion (rising edge) of pci_rst_n. the other bits are reset on the standard internal resets. see section 4.4.1 for details.

field       bit(s)  initial value  description
apme        0       0b             advance power management enable. if 1b, apm wakeup is enabled (see section 5.5.1). this bit is loaded from nvm.
pme_en      1       0b             pme_en. this read/write bit is used by the software device driver to access the pme_en bit of the power management control / status register (pmcsr) without writing to pcie configuration space.
pme_status  2       0b             pme_status. this bit is set when the 82574 receives a wake-up event. it is the same as the pme_status bit in the pmcsr. writing a 1b to this bit clears the pme_status bit in the pmcsr.
apmpme      3       0b             assert pme on apm wakeup. if set to 1b, the 82574 sets the pme_status bit in the pmcsr and asserts pe_wake_n when apm wake up is enabled and the 82574 receives a matching magic packet (see section 5.5.1).
lscwe       4       0b             link status change wake enable. enables wake on link status change as part of apm wake capabilities.
lscwo       5       0b             link status change wake override. if set to 1b, wake on link status change does not depend on the lnkc bit in the wake up filter control (wufc) register. instead, it is determined by the apm settings in the wuc register (see section 10.2.7.36). this bit is loaded from nvm.
ftfa1       6       0b             flexible tco filter 1 allocation. 1b = allocate flex tco1 filter for wake. 0b = allocate flex tco1 filter for manageability.
ftf1_en     7       0b             flexible tco filter 1 enable. when set, flex tco1 filter is enabled for wake up. when cleared, flex tco1 filter is disabled. this bit takes effect only when the ftfa1 bit is set (for example, flex tco1 filter is allocated for apm wake).
ftfa0       8       0b             flexible tco filter 0 allocation. 1b = allocate flex tco0 filter for wake. 0b = allocate flex tco0 filter for manageability.
ftf0_en     9       0              flexible tco filter 0 enable. when set, flex tco0 filter is enabled for wake up. when cleared, flex tco0 filter is disabled. this bit takes effect only when the ftfa0 bit is set (for example, flex tco0 filter is allocated for wake).
reserved    31:10   0              reserved.
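One practical consequence of the wuc layout is that pme_status is cleared by writing 1b, so a read-modify-write of the register can clear it by accident. The C sketch below shows one way to guard against that when enabling apm wake. The macro and function names are hypothetical, the bit positions come from the table above, and the functions only compute the value to write; the actual register access is left to the caller.

```c
#include <stdint.h>

/* Bit positions from the wuc field table (names are illustrative). */
#define WUC_APME        (1u << 0)
#define WUC_PME_EN      (1u << 1)
#define WUC_PME_STATUS  (1u << 2)  /* write 1b to clear */
#define WUC_APMPME      (1u << 3)
#define WUC_LSCWE       (1u << 4)
#define WUC_LSCWO       (1u << 5)

/*
 * Given the value just read from wuc, return the value to write back so
 * that apm wake is enabled and pme is asserted on an apm wake-up.
 * pme_status is masked out so the write does not clear a pending status.
 */
static uint32_t wuc_apm_wake_value(uint32_t wuc)
{
    wuc &= ~WUC_PME_STATUS;
    return wuc | WUC_APME | WUC_APMPME;
}

/*
 * Return the value to write back in order to acknowledge (clear)
 * pme_status while leaving the other bits as read.
 */
static uint32_t wuc_clear_pme_status_value(uint32_t wuc)
{
    return wuc | WUC_PME_STATUS;
}
```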
10.2.8.2 wake up filter control register - wufc (0x05808; rw)

this register is used to enable each of the pre-defined and flexible filters for wake-up support. a value of one means the filter is turned on, and a value of zero means the filter is turned off. if the notco bit is set, then any packet that passes the manageability packet filtering described in section 3.5 does not cause a wake-up event even if it passes one of the wake-up filters.

field     bit(s)  initial value  description
lnkc      0       0b             link status change wake up enable
mag       1       0b             magic packet wake up enable
ex        2       0b             directed exact wake up enable
mc        3       0b             directed multicast wake up enable
bc        4       0b             broadcast wake up enable
arp       5       0b             arp/ipv4 request packet wake up enable
ipv4      6       0b             directed ipv4 packet wake up enable
ipv6      7       0b             directed ipv6 packet wake up enable
reserved  14:8    0              reserved
notco     15      0b             ignore tco packets for tco
flx0      16      0b             flexible filter 0 enable
flx1      17      0b             flexible filter 1 enable
flx2      18      0b             flexible filter 2 enable
flx3      19      0b             flexible filter 3 enable
reserved  31:20   0x0            reserved

10.2.8.3 wake-up status register - wus (0x05810; rw)

field     bit(s)  initial value  description
lnkc      0       0b             link status changed
mag       1       0b             magic packet received
ex        2       0b             directed exact packet received. the packet's address matched one of the 16 pre-programmed exact values in the receive address registers.
mc        3       0b             directed multicast packet received. the packet was a multicast packet that was hashed to a value corresponding to a 1-bit in the multicast table array.
bc        4       0b             broadcast packet received
arp       5       0b             arp/ipv4 request packet received
ipv4      6       0b             directed ipv4 packet received
ipv6      7       0b             directed ipv6 packet received
reserved  8       0b             reserved
tco0      9       0b             flexible tco filter 0 match when allocated to wake up.
tco1      10      0b             flexible tco filter 1 match when allocated to wake up.
reserved  15:11   0x0            reserved
flx0      16      0b             flexible filter 0 match
flx1      17      0b             flexible filter 1 match
flx2      18      0b             flexible filter 2 match
flx3      19      0b             flexible filter 3 match
reserved  31:20   0x0            reserved

this register is used to record statistics about all wake-up packets received. if a packet matches multiple criteria then multiple bits could be set. writing a 1b to any bit clears that bit. this register is not cleared when pci_rst_n is asserted. it is only cleared when internal power on reset is de-asserted or when cleared by the software device driver.

10.2.8.4 management flex udp/tcp ports 0/1 - mfutp01 (0x05828; rw)

field     bit(s)  initial value  description
mfutp0    15:0    0x0            management flex udp/tcp port 0. these bits can also be configured from the smbus.
mfutp1    31:16   0x0            management flex udp/tcp port 1. these bits can also be configured from the smbus.

10.2.8.5 management flex udp/tcp port 2/3 - mfutp23 (0x05830; rw)

field     bit(s)  initial value  description
mfutp2    15:0    0x0            management flex udp/tcp port 2. these bits can also be configured from the smbus.
mfutp3    31:16   0x0            management flex udp/tcp port 3. these bits can also be configured from the smbus.

10.2.8.6 ip address valid - ipav (0x5838; rw)

the ip address valid register indicates whether the ip addresses in the ip address table are valid:

field     bit(s)  initial value  description
v40       0       0b (note 1)    ipv4 address 0 valid
v41       1       0b             ipv4 address 1 valid
v42       2       0b             ipv4 address 2 valid
v43       3       0b             ipv4 address 3 valid
reserved  15:4    0x0            reserved
v60       16      0b             ipv6 address 0 valid
reserved  31:17   0x0            reserved

note 1: the initial value is loaded from the ip address valid bit of the nvm's management control register.

10.2.8.7 ipv4 address table - ip4at (0x05840-0x05858; rw)

the ipv4 address table register is used to store the four ipv4 addresses for arp/ipv4 request packet and directed ipv4 packet wake up. the first entry is also used to store the ip address used for routing rmcp and optionally arp packets to the smbus or internal asf function. it has the following format:

dword#  address  bits 31:0
0       0x5840   ipv4addr0
2       0x5848   ipv4addr1
4       0x5850   ipv4addr2
6       0x5858   ipv4addr3

field      dword#  address  bit(s)  initial value  description
ipv4addr0  0       0x5840   31:0    x              ipv4 address 0 (least significant byte is first on the wire).
ipv4addr1  2       0x5848   31:0    x              ipv4 address 1
ipv4addr2  4       0x5850   31:0    x              ipv4 address 2
ipv4addr3  6       0x5858   31:0    x              ipv4 address 3

10.2.8.8 management control register - manc (0x05820; rw)

this register is written by the mc and should not be written by the host.

field             bit(s)  initial value  description
reserved          15:0    0x0            reserved
tco_reset         16      0b             tco reset occurred. set to 1b on a tco reset. this bit is only reset by internal power on reset.
rcv_tco_en        17      0b             receive tco packets enabled. when this bit is set, it enables the receive flow from the wire to the manageability block (see note 1).
keep_phy_link_up  18      0b             block phy reset and power state changes. when this bit is set, the phy is not reset on pe_rst_n or in-band pcie reset and it does not change its power state. this bit cannot be written unless the no_phy_rst eeprom bit is set. this bit is reset by internal power on reset.
rcv_all           19      0b             receive all enable. when set, all received packets that passed l2 filtering are directed to the manageability block. when rcv_all is set to 1b, no other manageability filters should be set - all traffic is directed to the manageability subsystem.
359 driver programing interface?82574 gbe controller 10.2.8.9 management control to host register - manc2h (0x5860; rw) the manc2h register enables routing of ma nageability packets to the host based on the decision filter that routed it to the manageability micro-controller. each manageability decision filter (mdef) has a corresponding bit in the manc2h register. when an mdef routes a packet to manageability, it also routes the packet to the host if the corresponding manc2h bit is set and if the en_mng2host bit is set. the en_mng2host bit serves as a global enable for the manc2h bits. reset - the manc2h register is cleared on internal power on reset. mcst_pass_ l2 20 0b receive all multicast when set, all received multicast pa ckets pass l2 filtering (similar as host promiscuous multicast). these packets can be directed to the manageability block by a one of the decision filters. broadcast packets are not forwarded by this bit. en_ mng2host 21 0b enable manageability packets to host memory this bit enables the functionality of the manc2h register. when set, the packets that are specified in the manc2h registers are forwarded to host memory too, if they pass manageability filters. reserved 22 0b reserved en_xsum_ filter 23 0b enable xsum filtering to manageability when this bit is set, only packets that passes l3 and l4 checksum are sent to the manageability block. reserved 24 0b reserved fixed_net_ type 25 0b fixed net type if set, only packets matching the net type defined by the net_type field passes to manageability. otherwise, both tagged and un-tagged packets can be forwarded to manageability engine. net_type 26 0b net type: 0b = pass only un-tagged packets. 1b = pass only vlan tagged packets. valid only if fixed_net_type is set . reserved 27 0b reserved dis_ip_addr _for_arp 28 1b disable ip address checking for arp packets when set, the ip address is not checked for a match on arp packets. 
when cleared, an arp request packet is passed to the mc only if the ip filter was configured and there is a match with one of the four programmed ipv4. this bit affects manageability filterin g only. it does not affect wake-up arp. reserved 31:29 0x0 reserved 1. when set, this bit actually indicates the presence of a manageability entity. therefore, it prevents the phy from being powered down while in power saving states. when this bit is cleared, the phy might be powered down, so transmit flow might not be possible as well. it's therefore recommended to set this bit when the bmc needs to enable either receive or transmit. field bit(s) initial value description field bit(s) initial value description host enable 7:0 0x0 host enable when set, indicates that packets ro uted by the manageability filters to manageability are also sent to the hos t. bit 0 corresponds to decision rule 0, etc. reserved 31:8 0x0 reserved
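the host-forwarding rule described for manc2h can be sketched as a small check. this is only an illustration of the two-level enable (the global en_mng2host bit plus one host enable bit per decision rule); the constant and function names are made up for the example and do not come from any driver.

```python
# bit positions copied from the manc table; the names are illustrative only.
MANC_RCV_TCO_EN = 1 << 17
MANC_RCV_ALL = 1 << 19
MANC_EN_MNG2HOST = 1 << 21


def also_sent_to_host(manc: int, manc2h: int, mdef_index: int) -> bool:
    """Return True if a packet that mdef[mdef_index] routed to
    manageability is also forwarded to the host."""
    if not 0 <= mdef_index <= 7:
        raise ValueError("manc2h host enable covers decision rules 0..7")
    global_enable = bool(manc & MANC_EN_MNG2HOST)       # manc bit 21
    rule_enable = bool(manc2h & (1 << mdef_index))      # manc2h bits 7:0
    return global_enable and rule_enable
```

both conditions must hold: a host enable bit in manc2h has no effect while en_mng2host is cleared.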
10.2.8.10 manageability filters valid - mfval (0x5824; rw)

the manageability filters valid register indicates which filter registers contain a valid entry.

reset - the mfval register is cleared on internal power on reset.

field | bit(s) | initial value | description
mac | 0 | 0b | mac. indicates if the mac unicast filter registers (rah[15], ral[15]) contain valid mac addresses.
reserved | 7:1 | 0x0 | reserved
vlan | 11:8 | 0x0 | vlan. indicates if the vlan filter registers (mavtv) contain valid vlan tags. bit 8 corresponds to filter 0, etc.
reserved | 15:12 | 0x0 | reserved
ipv4 | 16 | 0b | ipv4. indicates if the ipv4 address filter (ip4at[0]) contains a valid ipv4 address.
reserved | 23:17 | 0x0 | reserved
ipv6 | 24 | 0b | ipv6. indicates if the ipv6 address filter (ip6at) contains a valid ipv6 address.
reserved | 31:25 | 0x0 | reserved

10.2.8.11 manageability decision filters - mdef (0x5890 + 4*n [n=0..7]; rw)

field | bit(s) | initial value | description
unicast and | 0 | 0b | unicast. controls the inclusion of unicast address filtering in the manageability filter decision (and section).
broadcast and | 1 | 0b | broadcast. controls the inclusion of broadcast address filtering in the manageability filter decision (and section).
vlan and | 2 | 0b | vlan. controls the inclusion of vlan address filtering in the manageability filter decision (and section).
ip address | 3 | 0b | ip address. controls the inclusion of ip address filtering in the manageability filter decision (and section).
unicast or | 4 | 0b | unicast. controls the inclusion of unicast address filtering in the manageability filter decision (or section).
broadcast or | 5 | 0b | broadcast. controls the inclusion of broadcast address filtering in the manageability filter decision (or section).
multicast and | 6 | 0b | multicast. controls the inclusion of multicast address filtering in the manageability filter decision (and section). broadcast packets are not included by this bit. the packet must pass some l2 filtering to be included by this bit - either by the manc.mcst_pass_l2 or by some dedicated mac address.
arp request | 7 | 0b | arp request. controls the inclusion of arp request filtering in the manageability filter decision (or section).
arp response | 8 | 0b | arp response. controls the inclusion of arp response filtering in the manageability filter decision (or section).
neighbor discovery (solicitation) | 9 | 0b | neighbor solicitation. controls the inclusion of neighbor solicitation filtering in the manageability filter decision (or section).
port 0x298 | 10 | 0b | port 0x298. controls the inclusion of port 0x298 filtering in the manageability filter decision (or section).
port 0x26f | 11 | 0b | port 0x26f. controls the inclusion of port 0x26f filtering in the manageability filter decision (or section).
flex port | 15:12 | 0x0 | flex port. controls the inclusion of flex port filtering in the manageability filter decision (or section). bit 12 corresponds to flex port 0, etc.
reserved | 27:16 | 0x0 | reserved
flex tco | 29:28 | 00b | flex tco. controls the inclusion of flex tco filtering in the manageability filter decision (or section). bit 28 corresponds to flex tco filter 0, etc.
reserved | 31:30 | 00b | reserved

10.2.8.12 ipv6 address table - ip6at (0x05880-0x0588f; rw)

the ipv6 address table register is used to store the ipv6 addresses for neighbor solicitation packet filtering and directed ipv6 packet wake up and it has the following format:

dword# | address | 31:0
0 | 0x5880 | ipv6addr0
1 | 0x5884 |
2 | 0x5888 |
3 | 0x588c |

field | dword# | address | bit(s) | initial value | description
ipv6addr0 | 0 | 0x5880 | 31:0 | x | ipv6 address 0, bytes 1-4 (least significant byte is first on the wire).
ipv6addr0 | 1 | 0x5884 | 31:0 | x | ipv6 address 0, bytes 5-8
ipv6addr0 | 2 | 0x5888 | 31:0 | x | ipv6 address 0, bytes 9-12
ipv6addr0 | 3 | 0x588c | 31:0 | x | ipv6 address 0, bytes 13-16
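the mdef address formula and the ip6at layout can be sketched as follows. the helper names are hypothetical, and the byte packing reflects one reading of "least significant byte is first on the wire", namely that the first byte on the wire lands in bits 7:0 of dword 0; treat that as an assumption to verify against hardware.

```python
import ipaddress
import struct

MDEF_BASE = 0x5890    # mdef[n] is at 0x5890 + 4*n, n = 0..7
IP6AT_BASE = 0x5880   # four consecutive dwords

# a subset of the mdef bits from the table above (names are illustrative).
MDEF_UNICAST_AND = 1 << 0
MDEF_BROADCAST_AND = 1 << 1
MDEF_VLAN_AND = 1 << 2
MDEF_IP_ADDRESS_AND = 1 << 3
MDEF_ARP_REQUEST_OR = 1 << 7
MDEF_ARP_RESPONSE_OR = 1 << 8


def mdef_offset(n: int) -> int:
    """Register offset of manageability decision filter n."""
    if not 0 <= n <= 7:
        raise ValueError("mdef index must be 0..7")
    return MDEF_BASE + 4 * n


def ip6at_dwords(address: str) -> list:
    """Return (register offset, 32-bit value) pairs for one ipv6 address."""
    packed = ipaddress.IPv6Address(address).packed   # 16 bytes in wire order
    values = struct.unpack("<4I", packed)            # first wire byte -> bits 7:0
    return [(IP6AT_BASE + 4 * i, v) for i, v in enumerate(values)]
```

for example, a rule that requires a unicast match and accepts arp requests would be written as `MDEF_UNICAST_AND | MDEF_ARP_REQUEST_OR` to the offset returned by `mdef_offset(n)`.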
10.2.8.13 wake up packet memory [128 bytes] - wupm (0x05a00-0x05a7c; r)

this register is read only and it is used to store the first 128 bytes of the wake-up packet for software retrieval after the system wakes up. it is not cleared by any reset.

field | bit(s) | initial value | description
wupd | 31:0 | x | wake up packet data

10.2.8.14 function active and power state to mng - factps (0x05b30; ro)

this register is used by the 82574 firmware for configuration.

field | bit(s) | initial value | description
reserved | 31 | 0b | reserved
reserved | 30 | 0b | reserved
reserved | 29 | 1b | reserved
reserved | 28:9 | 0x0 | reserved
reserved | 8 | 0b | reserved
reserved | 7:4 | 0x0 | reserved
func0 aux_en | 3 | 0b | function 0 auxiliary (aux) power pm enable bit shadow from the configuration space.
lan0 valid | 2 | 1b | lan 0 enable. hardwired to 1b.
func0 power state | 1:0 | 00b | power state indication of function 0. 00b = dr. 01b = d0u. 10b = d0a. 11b = d3.

10.2.8.15 flexible filter length table - fflt (0x05f00-0x05f28; rw)

the flexible filter length table register stores the minimum packet lengths required to pass each of the flexible filters. any packets that are shorter than the programmed length won't pass that filter. each flexible filter considers a packet that doesn't have any mismatches up to that point to have passed the flexible filter when it reaches the required length. it does not check any bytes past that point.

field | dword # | address | bit(s) | initial value | description
len0 | 0 | 0x5f00 | 10:0 | 0 | minimum length for flexible filter 0
len1 | 2 | 0x5f08 | 10:0 | 0 | minimum length for flexible filter 1
len2 | 4 | 0x5f10 | 10:0 | 0 | minimum length for flexible filter 2
len3 | 6 | 0x5f18 | 10:0 | 0 | minimum length for flexible filter 3
len tco 0 | 8 | 0x5f20 | 10:0 | 0 (nvm) | minimum length for flexible tco0 filter
len tco 1 | 10 | 0x5f28 | 10:0 | 0 (nvm) | minimum length for flexible tco1 filter

all reserved fields read as 0b's and ignore writes. bits 10:8 must be written as 0b.

note: before writing to the flexible filter length table, the software device driver must first disable the flexible filters by writing 0b's to the flexible filter enable bits of the wake up filter control (wufc.flxn) register.

10.2.8.16 flexible filter mask table - ffmt (0x09000-0x093f8; rw)

the flexible filter mask table register is used to store the four 1-bit masks for each of the first 128 data bytes in a packet, one for each flexible filter. if the mask bit is 1b, the corresponding flexible filter compares the incoming data byte at the index of the mask bit to the data byte stored in the flexible filter value table.

note: the table is organized to permit expansion to eight (or more) filters and 256 bytes in a future product without changing the address map.

note: before writing to the flexible filter mask table, the software device driver must first disable the flexible filters by writing 0b's to the flexible filter enable bits of the wake up filter control (wufc.flxn) register.

field | dword # | address | bit(s) | initial value | description
mask0 | 0 | 0x9000 | 3:0 | x | mask for filter [3:0] for byte 0
mask1 | 2 | 0x9008 | 3:0 | x | mask for filter [3:0] for byte 1
mask2 | 4 | 0x9010 | 3:0 | x | mask for filter [3:0] for byte 2
...
mask127 | 254 | 0x93f8 | 3:0 | x | mask for filter [3:0] for byte 127

10.2.8.17 flexible tco filter table - ftft (0x09400-0x097f8; rw)

these registers can be used by software to update the flex-tco filter bytes that should be compared. as opposed to the wake-up table, this structure contains the byte value and the bit mask in the same address. bits 7:0 and 8 are used for flex tco filter 0 and bits 16:9 and 17 are used for flex tco filter 1.

the tco flexible filters are enabled for manageability filtering if:
- bits 28,29 are set in any of the manageability decision filters (mdef). bit 28 enables flex tco0 filter, bit 29 enables flex tco1 filter.
- bits ftfa0/1 in the wuc register are cleared (0).

the tco flexible filters are enabled for wakeup if the ftfa0/1 and ftf0/1_en bits are set in the wuc register.

note: the initial values for this table can be loaded from the nvm after a power-up reset or configured from the smbus in pass-through mode. software has access to read from these registers. if software doesn't write to these registers they remain at their original value.

field | dword | address | bit(s) | initial value | description
filter 0 byte0 value | 0 | 0x9400 | 7:0 | x | tco filter 0 byte 0 value
filter 0 byte0 msk | 0 | 0x9400 | 8 | x | tco filter 0 byte 0 mask
filter 1 byte0 value | 0 | 0x9400 | 16:9 | x | tco filter 1 byte 0 value
filter 1 byte0 msk | 0 | 0x9400 | 17 | x | tco filter 1 byte 0 mask
filter 0 byte1 value | 0 | 0x9408 | 7:0 | x | tco filter 0 byte 1 value
filter 0 byte1 msk | 0 | 0x9408 | 8 | x | tco filter 0 byte 1 mask
filter 1 byte1 value | 0 | 0x9408 | 16:9 | x | tco filter 1 byte 1 value
filter 1 byte1 msk | 0 | 0x9408 | 17 | x | tco filter 1 byte 1 mask
...
filter 0 byte127 value | 0 | 0x97f8 | 7:0 | x | tco filter 0 byte 127 value
filter 0 byte127 msk | 0 | 0x97f8 | 8 | x | tco filter 0 byte 127 mask
filter 1 byte127 value | 0 | 0x97f8 | 16:9 | x | tco filter 1 byte 127 value
filter 1 byte127 msk | 0 | 0x97f8 | 17 | x | tco filter 1 byte 127 mask

10.2.8.18 flexible filter value table - ffvt (0x09800-0x09bf8; rw)

the flexible filter value table register is used to store the one value for each byte location in a packet for each flexible filter. if the corresponding mask bit is 1b, the flexible filter compares the incoming data byte to the values stored in this table.

note: the table is organized to permit expansion to eight filters and 256 bytes in a future product without changing the address map.

note: before writing to the flexible filter value table, the software device driver must first disable the flexible filters by writing 0b's to the flexible filter enable bits of the wake up filter control (wufc.flxn) register.

field | dword # | address | bit(s) | initial value | description
value0 | 0 | 0x9800 | 15:0 | x | value for filter [3:0] for byte 0
value1 | 2 | 0x9808 | 15:0 | x | value for filter [3:0] for byte 1
value2 | 4 | 0x9810 | 15:0 | x | value for filter [3:0] for byte 2
...
value127 | 254 | 0x9bf8 | 15:0 | x | value for filter [3:0] for byte 127
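the three per-byte tables (ffmt, ftft, ffvt) share the same 8-byte stride, and each ftft entry packs a value and a mask bit for both tco filters into one dword. a minimal sketch of the address arithmetic and the ftft packing follows; the function names are hypothetical and the code only restates the bit positions listed in the tables.

```python
FFMT_BASE = 0x9000
FTFT_BASE = 0x9400
FFVT_BASE = 0x9800
ENTRY_STRIDE = 8      # one entry every 8 bytes (dword # 0, 2, 4, ...)
MAX_BYTES = 128       # only the first 128 packet bytes are covered


def entry_offset(base: int, byte_index: int) -> int:
    """Register offset of the table entry for a given packet byte."""
    if not 0 <= byte_index < MAX_BYTES:
        raise ValueError("byte index must be 0..127")
    return base + ENTRY_STRIDE * byte_index


def pack_ftft_entry(f0_value: int, f0_mask: bool,
                    f1_value: int, f1_mask: bool) -> int:
    """Pack one ftft dword: filter 0 in bits 7:0 and 8, filter 1 in 16:9 and 17."""
    for value in (f0_value, f1_value):
        if not 0 <= value <= 0xFF:
            raise ValueError("filter byte values must fit in 8 bits")
    return (f0_value
            | (int(f0_mask) << 8)
            | (f1_value << 9)
            | (int(f1_mask) << 17))
```

`entry_offset(FTFT_BASE, 127)` gives 0x97f8, the last address listed in the ftft table.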
10.2.9 time sync register descriptions

10.2.9.1 rx time sync control register - tsyncrxctl (offset 0x0b620; rw)

bit | type | reset | description
0 | ro/v | 0b | rxtt. rx time stamp valid. equals 1b when a valid value for rx time stamp is captured in the rx time stamp register; cleared by read of rx time stamp register rxstmph.
3:1 | rw | 0x0 | type. type of packets to timestamp: 000b = time stamp l2 (v2) packets only (sync or delay_req depends on message type in section 10.2.9.6 and packets with message id 2 and 3). 001b = time stamp l4 (v1) packets only (sync or delay_req depends on message type in section 10.2.9.6). 010b = time stamp v2 (l2 and l4) packets (sync or delay_req depends on message type in section 10.2.9.6 and packets with message id 2 and 3). 100b = time stamp all packets (in this mode no locking is done to the value in the time stamp registers and no indications in receive descriptors are transferred). 101b = time stamp all packets whose message id bit 3 is zero, which means time stamp all event packets. this is applicable for v2 packets only. 011b, 110b and 111b = reserved.
4 | rw | 0x0 | en. enable rx time stamp. 0x0 = time stamping disabled. 0x1 = time stamping enabled.
31:5 | ro | 0x0 | reserved

10.2.9.2 rx time stamp low - rxstmpl (offset 0x0b624; rw)

bit | type | reset | description
31:0 | ro | 0x0 | rxstmpl. rx time stamp lsb value.

10.2.9.3 rx time stamp high - rxstmph (offset 0x0b628; rw)

bit | type | reset | description
31:0 | ro | 0x0 | rxstmph. rx time stamp msb value.

10.2.9.4 rx time stamp attributes low - rxsatrl (offset 0x0b62c; rw)

bit | type | reset | description
31:0 | ro | 0x0 | sourceidl. sourceuuid low. the value of this register is in host order.
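the tsyncrxctl field layout (rxtt in bit 0, type in bits 3:1, en in bit 4) can be illustrated with a small encoder and decoder. this is a sketch of the bit arithmetic only; the function names and the short type labels are made up for the example.

```python
# type encodings from the tsyncrxctl table; any other code is reserved.
RX_TYPE_NAMES = {
    0b000: "l2 (v2) packets only",
    0b001: "l4 (v1) packets only",
    0b010: "v2 (l2 and l4) packets",
    0b100: "all packets",
    0b101: "v2 event packets (message id bit 3 is zero)",
}


def encode_tsyncrxctl(packet_type: int, enable: bool) -> int:
    """Build the writable part of tsyncrxctl (type in 3:1, en in bit 4)."""
    if packet_type not in RX_TYPE_NAMES:
        raise ValueError("reserved type encoding")
    return (packet_type << 1) | (int(enable) << 4)


def decode_tsyncrxctl(value: int) -> dict:
    """Split a tsyncrxctl value into its three documented fields."""
    packet_type = (value >> 1) & 0x7
    return {
        "rxtt": bool(value & 0x1),
        "type": RX_TYPE_NAMES.get(packet_type, "reserved"),
        "en": bool(value & (1 << 4)),
    }
```

rxtt is a read-only status bit, so the encoder leaves bit 0 clear.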
10.2.9.5 rx time stamp attributes high - rxsatrh (offset 0x0b630; rw)

bit | type | reset | description
15:0 | ro | 0x0 | sourceidh. sourceuuid high. the value of this register is in host order.
31:16 | ro | 0x0 | sequenceid. the value of this register is in host order.

10.2.9.6 rx ethertype and message type register - rxcfgl (offset 0x0b634; rw)

bit | type | reset | description
15:0 | rw | 0x88f7 | ptp l2 ethertype to time stamp. the value of this register is programmed/read in network order.
23:16 | rw | 0x0 | v1 control to time stamp.
31:24 | rw | 0x0 | v2 messageid to time stamp.

10.2.9.7 rx udp port - rxudp (offset 0x0b638; rw)

bit | type | reset | description
15:0 | rw | 0x319 | uport. udp port number to time stamp. the value of this register is programmed/read in network order.
31:16 | ro | 0x0 | reserved

10.2.9.8 tx time sync control register - tsynctxctl (offset 0x0b614; rw)

bit | type | reset | description
0 | ro/v | 0 | txtt. tx time stamp valid. equals 1b when a valid value for tx timestamp is captured in the tx time stamp register. cleared by read of tx time stamp register txstmph.
3:1 | ro | 0 | reserved
4 | rw | 0 | en. enable tx timestamp. 0x0 = time stamping disabled. 0x1 = time stamping enabled.
31:5 | ro | 0 | reserved
10.2.9.9 tx time stamp value low - txstmpl (offset 0x0b618; rw)

bit | type | reset | description
31:0 | ro | 0x0 | txstmpl. tx timestamp lsb value.

10.2.9.10 tx time stamp value high - txstmph (offset 0x0b61c; rw)

bit | type | reset | description
31:0 | ro | 0x0 | txstmph. tx timestamp msb value.

10.2.9.11 system time register low - systiml (offset 0x0b600; rw)

bit | type | reset | description
31:0 | rw | 0x0 | stl. system time lsb register.

10.2.9.12 system time register high - systimh (offset 0x0b604; rw)

bit | type | reset | description
31:0 | rw | 0x0 | sth. system time msb register.

10.2.9.13 increment attributes register - timinca (offset 0x0b608; rw)

bit | type | reset | description
23:0 | rw | 0x0 | iv. increment value - incvalue.
31:24 | rw | 0x0 | ip. increment period - incperiod.

10.2.9.14 time adjustment offset register low - timadjl (offset 0x0b60c; rw)

bit | type | reset | description
31:0 | rw | 0x00 | tadjl. time adjustment value - low.
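the low/high register pairs and the two timinca fields reduce to simple bit packing. the sketch below shows only that arithmetic; the function names are hypothetical, and nothing here describes how the hardware applies incvalue and incperiod or in what order the halves must be read, since this section does not state it.

```python
MASK32 = 0xFFFFFFFF


def pack_timinca(incvalue: int, incperiod: int) -> int:
    """Pack timinca: incvalue in bits 23:0, incperiod in bits 31:24."""
    if not 0 <= incvalue < (1 << 24):
        raise ValueError("incvalue must fit in 24 bits")
    if not 0 <= incperiod < (1 << 8):
        raise ValueError("incperiod must fit in 8 bits")
    return (incperiod << 24) | incvalue


def join_time(high: int, low: int) -> int:
    """Combine an msb/lsb register pair (e.g. systimh/systiml) into 64 bits."""
    return ((high & MASK32) << 32) | (low & MASK32)


def split_time(value: int) -> tuple:
    """Split a 64-bit time into (high, low) 32-bit register values."""
    return (value >> 32) & MASK32, value & MASK32
```

the same join applies to the rxstmph/rxstmpl and txstmph/txstmpl pairs.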
10.2.9.15 time adjustment offset register high - timadjh (offset 0x0b610; rw)

bit | type | reset | description
30:0 | rw | 0x00 | tadjh. time adjustment value - high.
31 | rw | 0x0 | sign. sign ('0' = '+', '1' = '-').

10.2.10 msi-x register descriptions

these registers are used to configure the msi-x mechanism. the address and upper address registers set the address for each of the vectors. the message register sets the data sent to the relevant address. the vector control registers are used to enable specific vectors. the pending bit array register indicates which vectors have pending interrupts. the structure is listed in table 79.

table 79. msi-x table structure

dword3 | dword2 | dword1 | dword0 | entry | address
vector control | msg data | msg upper addr | msg addr | entry 0 | base
vector control | msg data | msg upper addr | msg addr | entry 1 | base + 1*16
vector control | msg data | msg upper addr | msg addr | entry 2 | base + 2*16
vector control | msg data | msg upper addr | msg addr | entry 3 | base + 3*16
vector control | msg data | msg upper addr | msg addr | entry 4 | base + 4*16

table 80. msi-x pba structure

63:0 | qword | address
pending bits 0 through 63 | qword0 | base
pending bits 64 through 127 | qword1 | base + 1*8
... | ... | ...
pending bits ((n-1) div 64)*64 through n-1 | qword((n-1) div 64) | base + ((n-1) div 64)*8

note: the table lists the general case. in the 82574 n = 5. as a result, only qword0 is implemented.
10.2.10.1 msi-x table entry lower address - msixtadd (bar3: 0x0000 + n*0x10 [n=0..4]; r/w)

field | bit(s) | initial value | description
message address lsb (ro) | 1:0 | 0x0 | for proper dword alignment, software must always write 0b's to these two bits. otherwise, the result is undefined.
message address | 31:2 | 0x0 | system-specific message lower address. for msi-x messages, the contents of this field from an msi-x table entry specifies the lower portion of the dword-aligned address for the memory write transaction.

10.2.10.2 msi-x table entry upper address - msixtuadd (bar3: 0x0004 + n*0x10 [n=0..4]; r/w)

field | bit(s) | initial value | description
message address | 31:0 | 0x0 | system-specific message upper address

10.2.10.3 msi-x table entry message - msixtmsg (bar3: 0x0008 + n*0x10 [n=0..4]; r/w)

field | bit(s) | initial value | description
message data | 31:0 | 0x0 | system-specific message data. for msi-x messages, the contents of this field from an msi-x table entry specifies the data written during the memory write transaction. in contrast to message data used for msi messages, the low-order message data bits in msi-x messages are not modified by the function.

10.2.10.4 msi-x table entry vector control - msixtvctrl (bar3: 0x000c + n*0x10 [n=0..4]; r/w)

field | bit(s) | initial value | description
mask | 0 | 1b | when this bit is set, the function is prohibited from sending a message using this msi-x table entry. however, any other msi-x table entries programmed with the same vector are still capable of sending an equivalent message unless they are also masked.
reserved | 31:1 | 0x0 | reserved
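the per-entry offsets above follow one pattern: entry n starts at n*0x10 inside bar3, and the four dwords sit at +0x0, +0x4, +0x8 and +0xc. a minimal sketch of that arithmetic, together with the pba qword count from the general-case formula, is shown below; the helper names are hypothetical.

```python
MSIX_ENTRY_SIZE = 0x10
MSIX_FIELD_OFFSETS = {
    "msixtadd": 0x0,     # message lower address
    "msixtuadd": 0x4,    # message upper address
    "msixtmsg": 0x8,     # message data
    "msixtvctrl": 0xC,   # vector control (bit 0 = mask)
}
NUM_VECTORS = 5          # the 82574 implements entries 0..4


def msix_field_offset(entry: int, field: str) -> int:
    """Offset from bar3 of one dword of one msi-x table entry."""
    if not 0 <= entry < NUM_VECTORS:
        raise ValueError("entry must be 0..4")
    return entry * MSIX_ENTRY_SIZE + MSIX_FIELD_OFFSETS[field]


def split_message_address(address: int) -> tuple:
    """Split a 64-bit message address into (lower, upper) register values."""
    if address & 0x3:
        raise ValueError("message address must be dword aligned")
    return address & 0xFFFFFFFF, (address >> 32) & 0xFFFFFFFF


def pba_qword_count(n: int) -> int:
    """Number of pba qwords needed for n vectors: ((n-1) div 64) + 1."""
    return (n - 1) // 64 + 1
```

with n = 5 the count is 1, which matches the note that only qword0 is implemented.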
10.2.10.5 msi-x pba bit description - msixpba (bar3: 0x02000; ro)

field | bit(s) | initial value | description
pending bits | 4:0 | 0x0 | for each pending bit that is set, the function has a pending message for the associated msi-x table entry. pending bits that have no associated msi-x table entry are reserved.
reserved | 31:5 | 0x0 | reserved

10.2.11 phy registers

phy registers can be accessed by using mdic as described in section 10.2.2.7.

table 81. 82574 phy register summary

category | offset | name
phy | any page, register 0 | control register
phy | any page, register 1 | status register
phy | any page, register 2 | phy identifier 1
phy | any page, register 3 | phy identifier 2
phy | any page, register 4 | auto-negotiation advertisement register
phy | any page, register 5 | link partner ability register - base page
phy | any page, register 6 | auto-negotiation expansion register
phy | any page, register 7 | next page transmit register
phy | any page, register 8 | link partner next page register
phy | any page, register 9 | 1000base-t control register
phy | any page, register 10 | 1000base-t status register
phy | any page, register 15 | extended status register
phy | page 0, register 16 | copper specific control register 1
phy | page 0, register 17 | copper specific status register 1
phy | page 0, register 18 | copper specific interrupt enable register
phy | page 0, register 19 | copper specific status register 2
phy | page 0, register 20 | copper specific control register 3
phy | page 0, register 21 | receive error counter register
phy | any page, register 22 | page address
phy | page 0, register 25 | oem bits
phy | page 0, register 26 | copper specific control register 2
phy | page 0, register 29 | bias setting register 1
phy | page 0, register 30 | bias setting register 2
phy | page 2, register 16 | mac specific control register 1
phy | page 2, register 18 | mac specific interrupt enable register
phy | page 2, register 19 | mac specific status register
phy | page 2, register 21 | mac specific control register 2
phy | page 3, register 16 | led[3:0] function control register
phy | page 3, register 17 | led[3:0] polarity control register
phy | page 3, register 18 | led timer control register
phy | page 3, register 19 | led[5:4] function control and polarity register
phy | page 5, register 20 | 1000 base-t pair skew register
phy | page 5, register 21 | 1000 base-t pair swap and polarity
phy | page 6, register 17 | crc counters
10.2.11.1 control register (any page), phy address 01; register 0

bits | field | mode | hw rst | sw rst | description
15 | reset | r/w, sc | 0x0 | sc | phy software reset. writing a 1b to this bit causes the phy state machines to be reset. when the reset operation completes, this bit is automatically cleared to 0b. the reset occurs immediately. 1b = phy reset. 0b = normal operation.
14 | loopback | r/w | 0x0 | 0x0 | when loopback is activated, the transmitter data presented on txd is looped back to rxd internally. the link is broken when loopback is enabled. loopback speed is determined by registers 21_2.2:0. 1b = enable loopback. 0b = disable loopback.
13 | speed select (lsb) | r/w | 0x0 | update | changes to this bit are disruptive to the normal operation; therefore, any changes to these registers must be followed by a software reset to take effect. a write to this register bit does not take effect until any one of the following also occurs: software reset is asserted (register 0.15); restart auto-negotiation is asserted (register 0.9); power down (register 0.11, 16_0.2) transitions from power down to normal operation. (bit 6, 13): 11b = reserved. 10b = 1000 mb/s. 01b = 100 mb/s. 00b = 10 mb/s.
12 | auto-negotiation enable | r/w | 0x1 | update | changes to this bit are disruptive to the normal operation. a write to this register bit does not take effect until any one of the following occurs: software reset is asserted (register 0.15); restart auto-negotiation is asserted (register 0.9); power down (register 0.11, 16_0.2) transitions from power down to normal operation. if register 0.12 is set to 0b and speed is manually forced to 1000 mb/s in registers 0.13 and 0.6, then auto-negotiation is still enabled and only 1000base-t full-duplex is advertised if register 0.8 is set to 1b, and 1000base-t half-duplex is advertised if register 0.8 is set to 0b. registers 4.8:5 and 9.9:8 are ignored. auto-negotiation is mandatory per ieee for proper operation in 1000base-t. 1b = enable auto-negotiation process. 0b = disable auto-negotiation process.
11 | power down | r/w | see description | retain | power down is controlled via register 0.11 and 16_0.2. both bits must be set to 0b before the phy transitions from power down to normal operation. when the port is switched from power down to normal operation, a software reset and restart auto-negotiation are performed even when bits reset (0.15) and restart auto-negotiation (0.9) are not set by the user. ieee power down shuts down the 82574 except for the gmii interface if 16_2.3 is set to 1b. if 16_2.3 is set to 0b, then the gmii interface also shuts down. after a hardware reset, this bit takes on the value of pd_pwrdn_a. 1b = power down. 0b = normal operation. when pd_pwrdn_a transitions from 1b to 0b this bit is set to 0b. when pd_pwrdn_a transitions from 0b to 1b this bit is set to 1b.
10 | isolate | ro | 0x0 | 0x0 | this bit has no effect.
9 | restart copper auto-negotiation | r/w, sc | 0x0 | sc | when pd_aneg_now_a transitions from 0b to 1b this bit is set to 1b. auto-negotiation automatically restarts after hardware or software reset regardless of whether or not the restart bit (0.9) is set. 1b = restart auto-negotiation process. 0b = normal operation.
8 | copper duplex mode | r/w | 0x1 | update | changes to this bit are disruptive to the normal operation; therefore, any changes to these registers must be followed by a software reset to take effect. a write to this register bit does not take effect until any one of the following also occurs: software reset is asserted (register 0.15); restart auto-negotiation is asserted (register 0.9); power down (register 0.11, 16_0.2) transitions from power down to normal operation. 1b = full-duplex. 0b = half-duplex.
7 | collision test | ro | 0x0 | 0x0 | this bit has no effect.
6 | speed selection (msb) | r/w | 0x1 | update | changes to this bit are disruptive to the normal operation; therefore, any changes to these registers must be followed by a software reset to take effect. a write to this register bit does not take effect until any one of the following occurs: software reset is asserted (register 0.15); restart auto-negotiation is asserted (register 0.9); power down (register 0.11, 16_0.2) transitions from power down to normal operation. (bit 6, 13): 11b = reserved. 10b = 1000 mb/s. 01b = 100 mb/s. 00b = 10 mb/s.
5:0 | reserved | ro | always 0x0 | always 0x0 | reserved, always 0x0.
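the forced-speed encoding is split across two non-adjacent bits of register 0 (bit 6 is the msb, bit 13 the lsb), which is easy to get backwards. the sketch below decodes it along with the auto-negotiation and duplex bits; the function names are made up for the example, and it interprets only the register value, not whether a pending write has taken effect yet.

```python
CTRL_SPEED_MSB = 1 << 6
CTRL_DUPLEX = 1 << 8
CTRL_RESTART_AN = 1 << 9
CTRL_POWER_DOWN = 1 << 11
CTRL_AN_ENABLE = 1 << 12
CTRL_SPEED_LSB = 1 << 13
CTRL_RESET = 1 << 15

_SPEEDS = {0b00: 10, 0b01: 100, 0b10: 1000}   # 0b11 is reserved


def forced_speed_mbps(control: int):
    """Speed selected by bits (6, 13), or None for the reserved code 11b."""
    msb = 1 if control & CTRL_SPEED_MSB else 0
    lsb = 1 if control & CTRL_SPEED_LSB else 0
    return _SPEEDS.get((msb << 1) | lsb)


def decode_control(control: int) -> dict:
    """Decode the main configuration bits of phy register 0."""
    return {
        "autoneg": bool(control & CTRL_AN_ENABLE),
        "full_duplex": bool(control & CTRL_DUPLEX),
        "power_down": bool(control & CTRL_POWER_DOWN),
        "speed_mbps": forced_speed_mbps(control),
    }
```

the value 0x1140 combines the hardware reset values listed for bits 13, 12, 8 and 6, assuming bit 11 and the self-clearing bits read as zero; it decodes to auto-negotiation enabled, full duplex, 1000 mb/s.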
10.2.11.2 status register (any page), phy address 01; register 1

bits | field | mode | hw rst | sw rst | description
15 | 100base-t4 | ro | always 0b | always 0b | 100base-t4. this protocol is not available. 0b = phy not able to perform 100base-t4.
14 | 100base-x full-duplex | ro | always 1b | always 1b | 1b = phy able to perform full-duplex 100base-x.
13 | 100base-x half-duplex | ro | always 1b | always 1b | 1b = phy able to perform half-duplex 100base-x.
12 | 10 mbps full-duplex | ro | always 1b | always 1b | 1b = phy able to perform full-duplex 10base-t.
11 | 10 mbps half-duplex | ro | always 1b | always 1b | 1b = phy able to perform half-duplex 10base-t.
10 | 100base-t2 full-duplex | ro | always 0b | always 0b | this protocol is not available. 0b = phy not able to perform full-duplex.
9 | 100base-t2 half-duplex | ro | always 0b | always 0b | this protocol is not available. 0b = phy not able to perform half-duplex.
8 | extended status | ro | always 1b | always 1b | 1b = extended status information in register 15.
7 | reserved | ro | always 0b | always 0b | reserved, always 0b.
6 | mf preamble suppression | ro | always 1b | always 1b | 1b = phy accepts management frames with preamble suppressed.
5 | copper auto-negotiation complete | ro | 0x0 | 0x0 | 1b = auto-negotiation process complete. 0b = auto-negotiation process not complete.
4 | copper remote fault | ro, lh | 0x0 | 0x0 | 1b = remote fault condition detected. 0b = remote fault condition not detected.
3 | auto-negotiation ability | ro | always 1b | always 1b | 1b = phy able to perform auto-negotiation.
2 | copper link status | ro, ll | 0x0 | 0x0 | this register bit indicates when link was lost since the last read. for the current link status, either read this register back-to-back or read register 17_0.10 link real time. 1b = link is up. 0b = link is down.
1 | jabber detect | ro, lh | 0x0 | 0x0 | 1b = jabber condition detected. 0b = jabber condition not detected.
0 | extended capability | ro | always 1b | always 1b | 1b = extended register capabilities.

10.2.11.3 phy identifier 1 (any page), phy address 01; register 2

bits | field | mode | hw rst | sw rst | description
15:0 | organizationally unique identifier bit 3:18 | ro | 0x0141 | 0x0141 | the oui is 0x005043: 0000 0000 0101 0000 0100 0011 (bit 1 is leftmost, bit 24 is rightmost). register 2 [15:0] shows bits 3 to 18 of the oui: 0000000101000001 (bit 3 is leftmost, bit 18 is rightmost).
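the oui bit numbering used here counts from the most significant end (bit 1 is the leftmost of the 24 bits), which is the reverse of the usual register bit numbering. a short sketch of the slice arithmetic, using a hypothetical helper name, shows how 0x005043 yields the 0x0141 value of register 2:

```python
def oui_bits(oui: int, first: int, last: int) -> int:
    """Extract oui bits first..last, where bit 1 is the most significant
    of the 24-bit oui and bit 24 the least significant."""
    if not 1 <= first <= last <= 24:
        raise ValueError("bit range must lie within 1..24")
    width = last - first + 1
    shift = 24 - last                 # distance of 'last' from the lsb
    return (oui >> shift) & ((1 << width) - 1)


OUI = 0x005043
phy_id1 = oui_bits(OUI, 3, 18)        # value of register 2, bits 15:0
low_six = oui_bits(OUI, 19, 24)       # the remaining six low-order oui bits
```

bits 1 and 2 of this oui are zero, so the 22 bits numbered 3 through 24 are enough to reconstruct it.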
375 driver programing interface?82574 gbe controller 10.2.11.4 phy identifier 2 (any page), phy address 01; register 3 10.2.11.5 auto-negotiation advertisement register (any page), phy address 01; register 4 bits field mode hw rst sw rst description 15:10 oui lsb ro always 000011b 0x00 organizationally unique identifier bits 19:24 00 0011 ^....... ..^ bit 19...bit 24 9:4 model number ro always 001011b 0x00 model number 001011b. 3:0 revision number ro see description see description rev number. contact faes for information on the device revision number. bits field mode hw rst sw rst description 15 next page r/w 0x0 update a write to this register bit does not take effect until any one of the following occurs: ? software reset is asserted (register 0.15). ? restart auto-negotiation is asserted (register 0.9). ? power down (register 0.11, 16_0.2) transitions from power down to normal operation. ? copper link goes down. if 1000base-t is advertised then the required next pages are automatically transmitted. register 4.15 should be set to 0b if no additional next pages are needed. 1b = advertise. 0b = not advertised. 14 ack ro always 0b always 0b reserved, must be 0b. 13 remote fault r/w 0x0 update a write to this register bit does not take effect until any one of the following occurs: ? software reset is asserted (register 0.15). ? restart auto-negotiation is asserted (register 0.9). ? power down (register 0.11, 16_0.2) transitions from power down to normal operation. ? copper link goes down. 1b = set remote fault bit. 0b = do not set remote fault bit. 12 reserved r/w 0x0 update a write to this register bit does not take effect until any one of the following occurs: ? software reset is asserted (register 0.15). ? restart auto-negotiation is asserted (register 0.9). ? power down (register 0.11, 16_0.2) transitions from power down to normal operation. ? copper link goes down. reserved bit is r/w to allow for forward compatibility with futu re ieee standards.
82574 gbe controller?driver programing interface 376 11 asymmetric pause r/w see description update a write to this register bit does not take effect until any one of the following occurs: ? software reset is asserted (register 0.15). ? restart auto-negotiation is asserted (register 0.9). ? power down (register 0.11, 16_0.2) transitions from power down to normal operation. ? copper link goes down. after a hardware reset, this bit takes on the value of pd_config_asm_pause_a . 1b = asymmetric pause. 0b = no asymmetric pause. 10 pause r/w see description update a write to this register bit does not take effect until any one of the following occurs: ? software reset is asserted (register 0.15). ? restart auto-negotiation is asserted (register 0.9). ? power down (register 0.11, 16_0.2) transitions from power down to normal operation. ? copper link goes down. after a hardware reset, this bit takes on the value of pd_config_pause_a . 1b = mac pause implemented. 0b = mac pause not implemented. 9 100base-t4 r/w 0x0 retain 0b = not capable of 100base-t4. 8 100base-tx full-duplex r/w 0x1 update a write to this register bit does not take effect until any one of the following occurs: ? software reset is asserted (register 0.15). ? restart auto-negotiation is asserted (register 0.9). ? power down (register 0.11, 16_0.2) transitions from power down to normal operation. ? copper link goes down. if register 0.12 is set to 0b and speed is manually forced to 1000 mb/s in registers 0.13 and 0.6, then auto-negotiation is still enabled and only 1000base- t full-duplex is advertised if register 0.8 is set to 1b; 1000base-t half-duplex is advertised if 0.8 is set to 0b. registers 4.8:5 and 9.9:8 are ignored. auto-negotiation is mandatory per ieee for proper operation in 1000base-t. 1b = advertise. 0b = not advertised. 7 100base-tx half-duplex r/w 0x1 update a write to this register bit does not take effect until any one of the following occurs: ? 
software reset is asserted (register 0.15) ? restart auto-negotiation is asserted (register 0.9) ? power down (register 0.11, 16_0.2) transitions from power down to normal operation ? copper link goes down. if register 0.12 is set to 0b and speed is manually forced to 1000 mb/s in registers 0.13 and 0.6, then auto-negotiation is still enabled and only 1000base- t full-duplex is advertised if register 0.8 is set to 1b; 1000base-t half-duplex is advertised if 0.8 is set to 0b. registers 4.8:5 and 9.9:8 are ignored. auto-negotiation is mandatory per ieee for proper operation in 1000base-t. 1b = advertise. 0b = not advertised. bits field mode hw rst sw rst description
6 | 10base-t full-duplex | r/w | 0x1 | update | a write to this register bit does not take effect until any one of the following occurs: software reset is asserted (register 0.15); restart auto-negotiation is asserted (register 0.9); power down (register 0.11, 16_0.2) transitions from power down to normal operation; copper link goes down. if register 0.12 is set to 0b and speed is manually forced to 1000 mb/s in registers 0.13 and 0.6, then auto-negotiation is still enabled and only 1000base-t full-duplex is advertised if register 0.8 is set to 1b; 1000base-t half-duplex is advertised if 0.8 is set to 0b. registers 4.8:5 and 9.9:8 are ignored. auto-negotiation is mandatory per ieee for proper operation in 1000base-t. 1b = advertise. 0b = not advertised.
5 | 10base-t half-duplex | r/w | 0x1 | update | a write to this register bit does not take effect until any one of the following occurs: software reset is asserted (register 0.15); restart auto-negotiation is asserted (register 0.9); power down (register 0.11, 16_0.2) transitions from power down to normal operation; copper link goes down. if register 0.12 is set to 0b and speed is manually forced to 1000 mb/s in registers 0.13 and 0.6, then auto-negotiation is still enabled and only 1000base-t full-duplex is advertised if register 0.8 is set to 1b; 1000base-t half-duplex is advertised if 0.8 is set to 0b. registers 4.8:5 and 9.9:8 are ignored. auto-negotiation is mandatory per ieee for proper operation in 1000base-t. 1b = advertise. 0b = not advertised.
4:0 | selector field | r/w | 0x01 | retain | selector field mode. 00001b = 802.3.

10.2.11.6 link partner ability register - base page (any page), phy address 01; register 5

bits | field | mode | hw rst | sw rst | description
15 | next page | ro | 0x0 | 0x0 | received code word bit 15. 1b = link partner capable of next page. 0b = link partner not capable of next page.
14 | acknowledge | ro | 0x0 | 0x0 | received code word bit 14. 1b = link partner received link code word. 0b = link partner has not received link code word.
13 | remote fault | ro | 0x0 | 0x0 | received code word bit 13. 1b = link partner detected remote fault. 0b = link partner has not detected remote fault.
12 | technology ability field | ro | 0x0 | 0x0 | received code word bit 12.
11 | asymmetric pause | ro | 0x0 | 0x0 | received code word bit 11. 1b = link partner requests asymmetric pause. 0b = link partner does not request asymmetric pause.
10 | pause capable | ro | 0x0 | 0x0 | received code word bit 10. 1b = link partner is capable of pause operation. 0b = link partner is not capable of pause operation.
9 | 100base-t4 capability | ro | 0x0 | 0x0 | received code word bit 9. 1b = link partner is 100base-t4 capable. 0b = link partner is not 100base-t4 capable.
8 | 100base-tx full-duplex capability | ro | 0x0 | 0x0 | received code word bit 8. 1b = link partner is 100base-tx full-duplex capable. 0b = link partner is not 100base-tx full-duplex capable.
7 | 100base-tx half-duplex capability | ro | 0x0 | 0x0 | received code word bit 7. 1b = link partner is 100base-tx half-duplex capable. 0b = link partner is not 100base-tx half-duplex capable.
6 | 10base-t full-duplex capability | ro | 0x0 | 0x0 | received code word bit 6. 1b = link partner is 10base-t full-duplex capable. 0b = link partner is not 10base-t full-duplex capable.
5 | 10base-t half-duplex capability | ro | 0x0 | 0x0 | received code word bit 5. 1b = link partner is 10base-t half-duplex capable. 0b = link partner is not 10base-t half-duplex capable.
4:0 | selector field | ro | 0x00 | 0x00 | selector field, received code word bits 4:0.

10.2.11.7 auto-negotiation expansion register (any page), phy address 01; register 6

bits | field | mode | hw rst | sw rst | description
15:5 | reserved | ro | 0x000 | 0x000 | reserved. must be 00000000000b.
4 | parallel detection fault | ro, lh | 0x0 | 0x0 | register 6.4 is not valid until the auto-negotiation complete bit (register 1.5) indicates completed. 1b = a fault has been detected via the parallel detection function. 0b = a fault has not been detected via the parallel detection function.
3 | link partner next page able | ro | 0x0 | 0x0 | register 6.3 is not valid until the auto-negotiation complete bit (register 1.5) indicates completed. 1b = link partner is next page able. 0b = link partner is not next page able.
2 | local next page able | ro | 0x1 | 0x1 | register 6.2 is not valid until the auto-negotiation complete bit (register 1.5) indicates completed. 1b = local device is next page able. 0b = local device is not next page able.
1 | page received | ro, lh | 0x0 | 0x0 | register 6.1 is not valid until the auto-negotiation complete bit (register 1.5) indicates completed. 1b = a new page has been received. 0b = a new page has not been received.
0 | link partner auto-negotiation able | ro | 0x0 | 0x0 | register 6.0 is not valid until the auto-negotiation complete bit (register 1.5) indicates completed. 1b = link partner is auto-negotiation able. 0b = link partner is not auto-negotiation able.
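The auto-negotiation expansion register (register 6) is only meaningful once register 1.5 reports completion. A minimal sketch of that gating, assuming the caller has already read both registers as 16-bit integers; the function name and the dictionary keys are hypothetical and not part of the datasheet:

```python
from typing import Dict, Optional

AN_COMPLETE = 1 << 5  # register 1.5, auto-negotiation complete


def decode_an_expansion(reg1: int, reg6: int) -> Optional[Dict[str, bool]]:
    """Decode register 6, or return None while register 1.5 is still 0b."""
    if not reg1 & AN_COMPLETE:
        return None  # register 6 bits are not valid yet
    return {
        "parallel_detection_fault": bool(reg6 & (1 << 4)),
        "link_partner_next_page_able": bool(reg6 & (1 << 3)),
        "local_next_page_able": bool(reg6 & (1 << 2)),
        "page_received": bool(reg6 & (1 << 1)),
        "link_partner_an_able": bool(reg6 & (1 << 0)),
    }
```

Bits 4 and 1 are latched-high (lh), so a driver would normally decode the value from a single read rather than reading the register twice.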
10.2.11.8 next page transmit register (any page), phy address 01; register 7

bits | field | mode | hw rst | sw rst | description
15 | next page | r/w | 0x0 | 0x0 | transmit code word bit 15. a write to register 7 implicitly sets a variable in the auto-negotiation state machine indicating that the next page has been loaded. a link failure clears register 7.
14 | reserved | ro | 0x0 | 0x0 | transmit code word bit 14.
13 | message page mode | r/w | 0x1 | 0x1 | transmit code word bit 13.
12 | acknowledge2 | r/w | 0x0 | 0x0 | transmit code word bit 12.
11 | toggle | ro | 0x0 | 0x0 | transmit code word bit 11.
10:0 | message/unformatted field | r/w | 0x001 | 0x001 | transmit code word bits 10:0.

10.2.11.9 link partner next page register (any page), phy address 01; register 8

bits | field | mode | hw rst | sw rst | description
15 | next page | ro | 0x0 | 0x0 | received code word bit 15.
14 | acknowledge | ro | 0x0 | 0x0 | received code word bit 14.
13 | message page | ro | 0x0 | 0x0 | received code word bit 13.
12 | acknowledge2 | ro | 0x0 | 0x0 | received code word bit 12.
11 | toggle | ro | 0x0 | 0x0 | received code word bit 11.
10:0 | message/unformatted field | ro | 0x000 | 0x000 | received code word bits 10:0.
10.2.11.10 1000base-t control register (any page), phy address 01; register 9

bits | field | mode | hw rst | sw rst | description
15:13 | test mode | r/w | 0x0 | 0x0 | tx_clk comes from the rx_clk pin for jitter testing in test modes 2 and 3. after exiting the test mode, a hardware reset or software reset (register 0.15) should be issued to ensure normal operation. a restart of auto-negotiation clears these bits. 000b = normal mode. 001b = test mode 1 - transmit waveform test. 010b = test mode 2 - transmit jitter test (master mode). 011b = test mode 3 - transmit jitter test (slave mode). 100b = test mode 4 - transmit distortion test. 101b, 110b, 111b = reserved.
12 | master/slave manual configuration enable | r/w | 0x0 | update | a write to this register bit does not take effect until any of the following also occurs: software reset is asserted (register 0.15); restart auto-negotiation is asserted (register 0.9); power down (register 0.11, 16_0.2) transitions from power down to normal operation; copper link goes down. 1b = manual master/slave configuration. 0b = automatic master/slave configuration.
11 | master/slave configuration value | r/w | see description | update | a write to this register bit does not take effect until any of the following also occurs: software reset is asserted (register 0.15); restart auto-negotiation is asserted (register 0.9); power down (register 0.11, 16_0.2) transitions from power down to normal operation; copper link goes down. after a hardware reset, this bit takes on the value of pd_config_ms_a. 1b = manual configure as master. 0b = manual configure as slave.
10 | port type | r/w | see description | update | a write to this register bit does not take effect until any of the following also occurs: software reset is asserted (register 0.15); restart auto-negotiation is asserted (register 0.9); power down (register 0.11, 16_0.2) transitions from power down to normal operation; copper link goes down. register 9.10 is ignored if register 9.12 equals 1b. after a hardware reset, this bit takes on the value of pd_config_ms_a. 1b = prefer multi-port device (master). 0b = prefer single port device (slave).
9 | 1000base-t full-duplex | r/w | 0x1 | update | a write to this register bit does not take effect until any of the following also occurs: software reset is asserted (register 0.15); restart auto-negotiation is asserted (register 0.9); power down (register 0.11, 16_0.2) transitions from power down to normal operation; copper link goes down. 1b = advertise. 0b = not advertised.
8 | 1000base-t half-duplex | r/w | see description | update | a write to this register bit does not take effect until any of the following also occurs: software reset is asserted (register 0.15); restart auto-negotiation is asserted (register 0.9); power down (register 0.11, 16_0.2) transitions from power down to normal operation; copper link goes down. after a hardware reset, this bit takes on the value of pd_config_1000hd_a. 1b = advertise. 0b = not advertised.
7:0 | reserved | r/w | 0x00 | retain | reserved, set to 0x00.

10.2.11.11 1000base-t status register (any page), phy address 01; register 10

bits | field | mode | hw rst | sw rst | description
15 | master/slave configuration fault | ro, lh | 0x0 | 0x0 | this register bit clears on reads. 1b = master/slave configuration fault detected. 0b = no master/slave configuration fault detected.
14 | master/slave configuration resolution | ro | 0x0 | 0x0 | 1b = local phy configuration resolved to master. 0b = local phy configuration resolved to slave.
13 | local receiver status | ro | 0x0 | 0x0 | 1b = local receiver operational. 0b = local receiver is not operational.
12 | remote receiver status | ro | 0x0 | 0x0 | 1b = remote receiver operational. 0b = remote receiver not operational.
11 | link partner 1000base-t full-duplex capability | ro | 0x0 | 0x0 | 1b = link partner is capable of 1000base-t full-duplex. 0b = link partner is not capable of 1000base-t full-duplex.
10 | link partner 1000base-t half-duplex capability | ro | 0x0 | 0x0 | 1b = link partner is capable of 1000base-t half-duplex. 0b = link partner is not capable of 1000base-t half-duplex.
9:8 | reserved | ro | 0x0 | 0x0 | reserved.
7:0 | idle error count | ro, sc | 0x00 | 0x00 | msb of idle error counter. these register bits report the idle error count since the last time this register was read. the counter reaches its maximum at 11111111b and does not roll over.
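Because bit 15 clears on read and the idle error count is self-clearing, every field of the 1000base-t status register should be taken from one read. A minimal sketch of such a decoder, assuming the raw 16-bit value of register 10 is already available; `decode_1000t_status` and its key names are hypothetical:

```python
def decode_1000t_status(reg10: int) -> dict:
    """Split one read of register 10 into its documented fields."""
    return {
        "ms_config_fault": bool(reg10 & (1 << 15)),   # latched, clears on read
        "resolved_master": bool(reg10 & (1 << 14)),   # 1b = master, 0b = slave
        "local_rx_ok": bool(reg10 & (1 << 13)),
        "remote_rx_ok": bool(reg10 & (1 << 12)),
        "lp_1000t_full_duplex": bool(reg10 & (1 << 11)),
        "lp_1000t_half_duplex": bool(reg10 & (1 << 10)),
        "idle_error_count": reg10 & 0xFF,             # saturates at 0xFF
    }
```

A count of 0xFF only says "255 or more" idle errors since the previous read, since the counter does not roll over.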
10.2.11.12 extended status register (any page), phy address 01; register 15

bits | field | mode | hw rst | sw rst | description
15 | 1000base-x full-duplex | ro | always 0b | always 0b | 0b = not 1000base-x full-duplex capable.
14 | 1000base-x half-duplex | ro | always 0b | always 0b | 0b = not 1000base-x half-duplex capable.
13 | 1000base-t full-duplex | ro | always 1b | always 1b | 1b = 1000base-t full-duplex capable.
12 | 1000base-t half-duplex | ro | always 1b | always 1b | 1b = 1000base-t half-duplex capable.
11:0 | reserved | ro | 0x000 | 0x000 | reserved, set to 0x000.

10.2.11.13 copper specific control register 1 (page 0), phy address 01; register 16

bits | field | mode | hw rst | sw rst | description
15 | disable link pulses | r/w | 0x0 | 0x0 | 1b = disable link pulse. 0b = enable link pulse.
14:12 | downshift counter | r/w | 0x3 | update | changes to these bits are disruptive to normal operation; therefore, any change to these bits must be followed by a software reset to take effect. 1x, 2x, ... 8x is the number of times the phy attempts to establish a gbe link before the phy downshifts to the next highest speed. 000b = 1x. 001b = 2x. 010b = 3x. 011b = 4x. 100b = 5x. 101b = 6x. 110b = 7x. 111b = 8x.
11 | downshift enable | r/w | 0x0 | update | changes to this bit are disruptive to normal operation; therefore, any change to this bit must be followed by a software reset to take effect. 1b = enable downshift. 0b = disable downshift.
10 | force copper link good | r/w | 0x0 | retain | if link is forced to be good, the link state machine is bypassed and the link is always up. in 1000base-t mode this has no effect. 1b = force link good. 0b = normal operation.
9:8 | energy detect | r/w | see description | update | after a hardware reset, both bits take on the value of pd_config_edet_a. 0xb = off (x = don't care). 10b = sense only on receive (energy detect). 11b = sense and periodically transmit nlp (energy detect+tm).
7 | enable extended distance | r/w | 0x0 | retain | when using a cable exceeding 100 meters, the 10base-t receive threshold must be lowered in order to detect incoming signals. 1b = lower 10base-t receive threshold. 0b = normal 10base-t receive threshold.
6:5 | mdi crossover mode | r/w | 0x3 | update | changes to these bits are disruptive to normal operation; therefore, any change to these bits must be followed by a software reset to take effect. 00b = manual mdi configuration. 01b = manual mdix configuration. 10b = reserved. 11b = enable automatic crossover for all modes.
4 | reserved | r/w | 0x0 | retain | reserved, write as 0x0.
3 | copper transmitter disable | r/w | 0x0 | retain | 1b = transmitter disable. 0b = transmitter enable.
2 | power down | r/w | 0x0 | retain | power down is controlled via register 0.11 and 16_0.2. both bits must be set to 0b before the phy transitions from power down to normal operation. when the port is switched from power down to normal operation, a software reset and restart auto-negotiation are done even when bits reset (0.15) and restart auto-negotiation (0.9) are not set by the user. ieee power down shuts down the 82574 except for the gmii interface if 16_2.3 is set to 1b. if 16_2.3 is set to 0b, then the gmii interface also shuts down. 1b = power down. 0b = normal operation.
1 | polarity reversal disable | r/w | 0x0 | retain | if polarity reversal is disabled, then the polarity is forced to be normal in 10base-t. 1b = polarity reversal disabled. 0b = polarity reversal enabled. the detected polarity status is shown in register 17_0.1 or, in 1000base-t mode, in 21_5.3:0.
0 | disable jabber | r/w | 0x0 | retain | the jabber function has an effect only in 10base-t half-duplex mode. 1b = disable jabber function. 0b = enable jabber function.
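The downshift counter in 16_0.14:12 stores the attempt count minus one (000b = 1x through 111b = 8x), which is easy to get off by one. A minimal sketch of a read-modify-write helper for bits 14:11, assuming the current register value has been read beforehand; `set_downshift` is a hypothetical name, and the software reset (register 0.15) that the table requires afterwards is not shown:

```python
DOWNSHIFT_COUNT_SHIFT = 12
DOWNSHIFT_COUNT_MASK = 0x7 << DOWNSHIFT_COUNT_SHIFT  # bits 14:12
DOWNSHIFT_ENABLE = 1 << 11                           # bit 11


def set_downshift(reg16: int, attempts: int, enable: bool = True) -> int:
    """Return a new value for register 16 (page 0) with the downshift fields set."""
    if not 1 <= attempts <= 8:
        raise ValueError("attempts must be in the range 1..8")
    value = reg16 & 0xFFFF & ~(DOWNSHIFT_COUNT_MASK | DOWNSHIFT_ENABLE)
    value |= (attempts - 1) << DOWNSHIFT_COUNT_SHIFT  # 000b = 1x ... 111b = 8x
    if enable:
        value |= DOWNSHIFT_ENABLE
    return value
```

All other bits, such as the mdi crossover mode in 6:5, pass through unchanged.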
10.2.11.14 copper specific status register 1 (page 0), phy address 01; register 17

bits | field | mode | hw rst | sw rst | description
15:14 | speed | ro | 0x2 | retain | these status bits are valid only after resolved bit 17_0.11 equals 1b. the resolved bit is set when auto-negotiation completes or is disabled. 11b = reserved. 10b = 1000 mb/s. 01b = 100 mb/s. 00b = 10 mb/s.
13 | duplex | ro | 0x0 | retain | this status bit is valid only after resolved bit 17_0.11 equals 1b. the resolved bit is set when auto-negotiation completes or is disabled. 1b = full-duplex. 0b = half-duplex.
12 | page received | ro, lh | 0x0 | 0x0 | 1b = page received. 0b = page not received.
11 | speed and duplex resolved | ro | 0x0 | 0x0 | when auto-negotiation is not enabled, 17_0.11 equals 1b. 1b = resolved. 0b = not resolved.
10 | copper link (real time) | ro | 0x0 | 0x0 | 1b = link up. 0b = link down.
9 | transmit pause enabled | ro | 0x0 | 0x0 | this is a reflection of the mac pause resolution. this bit is for information purposes and is not used by the 82574. this status bit is valid only after resolved bit 17_0.11 equals 1b. the resolved bit is set when auto-negotiation completes or is disabled. 1b = transmit pause enabled. 0b = transmit pause disabled.
8 | receive pause enabled | ro | 0x0 | 0x0 | this is a reflection of the mac pause resolution. this bit is for information purposes and is not used by the 82574. this status bit is valid only after resolved bit 17_0.11 equals 1b. the resolved bit is set when auto-negotiation completes or is disabled. 1b = receive pause enabled. 0b = receive pause disabled.
7 | reserved | ro | 0x0 | 0x0 | reserved, set to 0x0.
6 | mdi crossover status | ro | 0x1 | retain | this status bit is valid only after resolved bit 17_0.11 equals 1b. the resolved bit is set when auto-negotiation completes or is disabled. this bit is 0b or 1b depending on what is written to 16.6:5 in manual configuration mode. register 16.6:5 is updated with a software reset. 1b = mdi-x. 0b = mdi.
5 | downshift status | ro | 0x0 | 0x0 | 1b = downshift. 0b = no downshift.
4 | copper energy detect status | ro | 0x0 | 0x0 | 1b = sleep. 0b = active.
3 | global link status | ro | 0x0 | 0x0 | 1b = copper link is up. 0b = copper link is down.
2 | reserved | ro | 0x0 | 0x0 | reserved, set to 0x0.
1 | polarity (real time) | ro | 0x0 | 0x0 | polarity reversal can be disabled by writing to register 16_0.1. in 1000base-t mode, the polarity of all pairs is shown in register 21_5.3:0. 1b = reversed. 0b = normal.
0 | jabber (real time) | ro | 0x0 | 0x0 | 1b = jabber. 0b = no jabber.

10.2.11.15 copper specific interrupt enable register (page 0), phy address 01; register 18

bits | field | mode | hw rst | sw rst | description
15 | auto-negotiation error interrupt enable | r/w | 0x0 | retain | 1b = interrupt enable. 0b = interrupt disable.
14 | speed changed interrupt enable | r/w | 0x0 | retain | 1b = interrupt enable. 0b = interrupt disable.
13 | duplex changed interrupt enable | r/w | 0x0 | retain | 1b = interrupt enable. 0b = interrupt disable.
12 | page received interrupt enable | r/w | 0x0 | retain | 1b = interrupt enable. 0b = interrupt disable.
11 | auto-negotiation completed interrupt enable | r/w | 0x0 | retain | 1b = interrupt enable. 0b = interrupt disable.
10 | link status changed interrupt enable | r/w | 0x0 | retain | 1b = interrupt enable. 0b = interrupt disable.
9 | symbol error interrupt enable | r/w | 0x0 | retain | 1b = interrupt enable. 0b = interrupt disable.
8 | false carrier interrupt enable | r/w | 0x0 | retain | 1b = interrupt enable. 0b = interrupt disable.
7 | reserved | r/w | 0x0 | retain | reserved, set to 0x0.
6 | mdi crossover changed interrupt enable | r/w | 0x0 | retain | 1b = interrupt enable. 0b = interrupt disable.
5 | downshift interrupt enable | r/w | 0x0 | retain | 1b = interrupt enable. 0b = interrupt disable.
4 | energy detect interrupt enable | r/w | 0x0 | retain | 1b = interrupt enable. 0b = interrupt disable.
3 | flp exchange complete but no link interrupt enable | r/w | 0x0 | retain | 1b = interrupt enable. 0b = interrupt disable.
2 | reserved | r/w | 0x0 | retain | reserved, set to 0x0.
1 | polarity changed interrupt enable | r/w | 0x0 | retain | 1b = interrupt enable. 0b = interrupt disable.
0 | jabber interrupt enable | r/w | 0x0 | retain | 1b = interrupt enable. 0b = interrupt disable.
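In copper specific status register 1 (register 17, page 0), the speed and duplex fields are valid only once resolved bit 17_0.11 equals 1b. A minimal sketch of a decoder that honours that rule, assuming the raw register value is supplied by the caller; the function name and result keys are hypothetical:

```python
SPEED_MBPS = {0b00: 10, 0b01: 100, 0b10: 1000}  # 11b is reserved


def decode_copper_status(reg17: int) -> dict:
    """Decode link, speed and duplex from register 17 (page 0)."""
    resolved = bool(reg17 & (1 << 11))
    link_up = bool(reg17 & (1 << 10))  # real-time copper link
    if not resolved:
        # speed and duplex are not valid until 17_0.11 equals 1b
        return {"link_up": link_up, "speed_mbps": None, "full_duplex": None}
    return {
        "link_up": link_up,
        "speed_mbps": SPEED_MBPS.get((reg17 >> 14) & 0x3),  # None for 11b
        "full_duplex": bool(reg17 & (1 << 13)),
    }
```

Returning `None` for unresolved fields keeps a caller from mistaking the 0x2 hardware-reset value of bits 15:14 for a negotiated 1000 mb/s link.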
10.2.11.16 copper specific status register 2 (page 0), phy address 01; register 19

bits | field | mode | hw rst | sw rst | description
15 | copper auto-negotiation error | ro, lh | 0x0 | 0x0 | an error occurs if the master/slave is not resolved, on a parallel detect fault, if there is no common hcd, or if the link does not come up after negotiation completes. 1b = auto-negotiation error. 0b = no auto-negotiation error.
14 | copper speed changed | ro, lh | 0x0 | 0x0 | 1b = speed changed. 0b = speed not changed.
13 | copper duplex changed | ro, lh | 0x0 | 0x0 | 1b = duplex changed. 0b = duplex not changed.
12 | copper page received | ro, lh | 0x0 | 0x0 | 1b = page received. 0b = page not received.
11 | copper auto-negotiation completed | ro, lh | 0x0 | 0x0 | 1b = auto-negotiation completed. 0b = auto-negotiation not completed.
10 | copper link status changed | ro, lh | 0x0 | 0x0 | 1b = link status changed. 0b = link status not changed.
9 | copper symbol error | ro, lh | 0x0 | 0x0 | 1b = symbol error. 0b = no symbol error.
8 | copper false carrier | ro, lh | 0x0 | 0x0 | 1b = false carrier. 0b = no false carrier.
7 | reserved | ro | always 0b | always 0b | reserved, always set to 0b.
6 | mdi crossover changed | ro, lh | 0x0 | 0x0 | 1b = crossover changed. 0b = crossover not changed.
5 | downshift interrupt | ro, lh | 0x0 | 0x0 | 1b = downshift detected. 0b = no downshift.
4 | energy detect changed | ro, lh | 0x0 | 0x0 | 1b = energy detect state changed. 0b = no energy detect state change detected.
3 | flp exchange complete but no link | ro, lh | 0x0 | 0x0 | 1b = flp exchange completed but link not established. 0b = no event detected.
2 | reserved | ro | 0x0 | 0x0 | reserved, set to 0x0.
1 | polarity changed | ro, lh | 0x0 | 0x0 | 1b = polarity changed. 0b = polarity not changed.
0 | jabber | ro, lh | 0x0 | 0x0 | 1b = jabber. 0b = no jabber.
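The interrupt enable register (register 18, page 0) and status register 2 (register 19, page 0) use the same bit position for each event, so one table can serve both. A minimal sketch, assuming the driver works with raw 16-bit values; the event names and both helper functions are hypothetical:

```python
# Bit positions shared by registers 18 and 19 (page 0); bits 7 and 2 are reserved.
COPPER_EVENTS = {
    15: "auto_negotiation_error",
    14: "speed_changed",
    13: "duplex_changed",
    12: "page_received",
    11: "auto_negotiation_completed",
    10: "link_status_changed",
    9: "symbol_error",
    8: "false_carrier",
    6: "mdi_crossover_changed",
    5: "downshift",
    4: "energy_detect_changed",
    3: "flp_exchange_complete_no_link",
    1: "polarity_changed",
    0: "jabber",
}
_BIT_BY_NAME = {name: bit for bit, name in COPPER_EVENTS.items()}


def enable_mask(*events: str) -> int:
    """Build a register 18 value that enables the named interrupts."""
    mask = 0
    for name in events:
        mask |= 1 << _BIT_BY_NAME[name]
    return mask


def pending_events(reg19: int) -> list:
    """List the events flagged in one read of register 19, highest bit first."""
    return [COPPER_EVENTS[bit] for bit in sorted(COPPER_EVENTS, reverse=True)
            if reg19 & (1 << bit)]
```

Since the status bits are latched-high, `pending_events` should be fed the value from a single read.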
10.2.11.17 copper specific control register 3 (page 0), phy address 01; register 20

bits | field | mode | hw rst | sw rst | description
15:4 | reserved | r/w | 0x000 | retain | reserved, write as all zeros.
3 | reverse mdi_plus/mdi_minus[3] transmit polarity | r/w | 0x0 | retain | 0b = normal transmit polarity. 1b = reverse transmit polarity.
2 | reverse mdi_plus/mdi_minus[2] transmit polarity | r/w | 0x0 | retain | 0b = normal transmit polarity. 1b = reverse transmit polarity.
1 | reverse mdi_plus/mdi_minus[1] transmit polarity | r/w | 0x0 | retain | 0b = normal transmit polarity. 1b = reverse transmit polarity.
0 | reverse mdi_plus/mdi_minus[0] transmit polarity | r/w | 0x0 | retain | 0b = normal transmit polarity. 1b = reverse transmit polarity.

10.2.11.18 receive error counter register (page 0), phy address 01; register 21

bits | field | mode | hw rst | sw rst | description
15:0 | receive error count | ro, lh | 0x0000 | retain | counter reaches its maximum at 0xffff and does not roll over. both false carrier and symbol errors are reported.
10.2.11.19 page address (any page), phy address 01; register 22

bits | field | mode | hw rst | sw rst | description
15:8 | reserved | ro | always 0x00 | always 0x00 | reserved, always set to 0x00.
7:0 | page select for registers 0 to 28 | r/w | 0x00 | retain | page number.

10.2.11.20 oem bits (page 0), phy address 01; register 25

bits | field | mode | hw rst | sw rst | description
15:11 | reserved | r/w | 0x0 | 0x0 | reserved, set to 0x0.
10 | aneg_now | r/w | 0b | 0b | restart auto-negotiation. note that this bit is self clearing.
9:7 | reserved | r/w | 0x0 | 0x0 | reserved, set to 0x0.
6 | a1000_dis | r/w | 0b | retain | gbe disable.
5:3 | reserved | r/w | 0x0 | 0x0 | reserved, set to 0x0.
2 | rev_aneg | r/w | 0b | retain | lplu.
1:0 | reserved | r/w | 0x0 | 0x0 | reserved, set to 0x0.
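Registers marked with a page number are reached by first writing that page to register 22 bits 7:0. A minimal sketch of a paged read that restores the previous page afterwards. `FakeMdio` is a hypothetical in-memory stand-in for real mdio access (it pages every register, whereas the "any page" registers of the 82574 ignore the page), and `read_paged` is likewise an illustrative name:

```python
PAGE_SELECT = 22  # page address register, bits 7:0 = page number


class FakeMdio:
    """Hypothetical in-memory bus used only to exercise the helper."""

    def __init__(self):
        self.page = 0
        self.regs = {}

    def read(self, reg: int) -> int:
        if reg == PAGE_SELECT:
            return self.page
        return self.regs.get((self.page, reg), 0)

    def write(self, reg: int, value: int) -> None:
        if reg == PAGE_SELECT:
            self.page = value & 0xFF
        else:
            self.regs[(self.page, reg)] = value & 0xFFFF


def read_paged(bus, page: int, reg: int) -> int:
    """Read register `reg` on `page`, then restore the previously selected page."""
    saved = bus.read(PAGE_SELECT)
    bus.write(PAGE_SELECT, page & 0xFF)
    try:
        return bus.read(reg)
    finally:
        bus.write(PAGE_SELECT, saved)
```

For example, `read_paged(bus, 3, 16)` would fetch the led[3:0] function control register (16_3) without leaving page 3 selected.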
10.2.11.21 copper specific control register 2 (page 0), phy address 01; register 26

bits | field | mode | hw rst | sw rst | description
15 | 1000base-t transmitter type | r/w | 0x0 | retain | 0b = class b. 1b = class a.
14 | disable 1000base-t | r/w | see description | retain | when set to disabled, 1000base-t is not advertised even if registers 9.9 or 9.8 are set to 1b. a write to this register bit does not take effect until any one of the following occurs: software reset is asserted (register 0.15); restart auto-negotiation is asserted (register 0.9); power down (register 0.11, 16_0.2) transitions from power down to normal operation; copper link goes down. after a hardware reset, this bit defaults as follows: ps_a1000_dis_s = 0: bit 26_0.14 = 0. ps_a1000_dis_s = 1: bit 26_0.14 = 1. when ps_a1000_dis_s transitions from one to zero, this bit is set to 0b. when ps_a1000_dis_s transitions from zero to one, this bit is set to 1b. 1b = disable 1000base-t advertisement. 0b = enable 1000base-t advertisement.
13 | reverse autoneg | r/w | see description | retain | a write to this register bit does not take effect until any one of the following occurs: software reset is asserted (register 0.15); restart auto-negotiation is asserted (register 0.9); power down (register 0.11, 16_0.2) transitions from power down to normal operation; copper link goes down. after a hardware reset, this bit defaults as follows: pd_rev_aneg_a = 0: bit 26_0.13 = 0. pd_rev_aneg_a = 1: bit 26_0.13 = 1. when pd_rev_aneg_a transitions from one to zero, this bit is set to 0b. when pd_rev_aneg_a transitions from zero to one, this bit is set to 1b. 1b = reverse auto-negotiation. 0b = normal auto-negotiation.
12 | 100base-t transmitter type | r/w | 0x0 | retain | 0b = class b. 1b = class a.
11:4 | reserved | r/w | 0x00 | retain | reserved, write as 0x00.
3:2 | 100 mb test select | r/w | 0x0 | retain | 0xb = normal operation (x = don't care). 10b = select 112 ns sequence. 11b = select 16 ns sequence.
1 | 10 bt polarity force | r/w | 0x0 | retain | 1b = force negative polarity for receive only. 0b = normal operation.
0 | reserved | r/w | 0x0 | retain | reserved, write as 0x0.
10.2.11.22 bias setting register 1 (page 0), phy address 01; register 29

bits | field | mode | hw rst | sw rst | description
15:0 | bias setting1 | r/w | | retain | used to optimize phy performance in 1000base-t mode. set to 0x0003 when initializing the 82574 to improve ber performance.

10.2.11.23 bias setting register 2 (page 0), phy address 01; register 30

bits | field | mode | hw rst | sw rst | description
15:0 | bias setting2 | r/w | | retain | used to optimize phy performance in 1000base-t mode. set to 0x0000 when initializing the 82574 to improve ber performance.

10.2.11.24 mac specific control register 1 (page 2), phy address 01; register 16

bits | field | mode | hw rst | sw rst | description
15:14 | transmit fifo depth | r/w | 0x0 | retain | 1000base-t: 00b = 16 bits. 01b = 24 bits. 10b = 32 bits. 11b = 40 bits.
13:10 | reserved | r/w | 0x00 | retain | reserved, set to 0x00.
9 | disable fi_125_clk | r/w | see description | retain | changes to this bit are disruptive to normal operation; therefore, any change to this bit must be followed by a software reset to take effect. after a hardware reset, this bit takes on the value of pd_pwrdn_clk125_a. when pd_pwrdn_clk125_a transitions from one to zero, this bit is set to 0b. when pd_pwrdn_clk125_a transitions from zero to one, this bit is set to 1b. 1b = fi_125_clk low. 0b = fi_125_clk toggle.
8 | disable fi_50_clk | r/w | see description | retain | after a hardware reset, this bit takes on the value of pd_pwrdn_clk50_a. when pd_pwrdn_clk50_a transitions from one to zero, this bit is set to 0b. when pd_pwrdn_clk50_a transitions from zero to one, this bit is set to 1b. 1b = fi_50_clk low. 0b = fi_50_clk toggle.
7 | reserved | r/w | 0x1 | update | reserved, write as 0x1.
6:4 | reserved | r/w | 0x0 | retain | reserved, write as 0x00.
3 | gmii interface power down | r/w | 0x1 | update | changes to this bit are disruptive to normal operation; therefore, any change to this bit must be followed by a software reset to take effect. this bit determines whether the gmii rx_clk powers down when register 0.11, 16_0.2 are used to power down the 82574 or when the phy enters the energy detect state. 1b = always power up. 0b = can power down.
2:0 | reserved | r/w | 0x0 | retain | reserved, write as 0x00.
10.2.11.25 mac specific interrupt enable register (page 2), phy address 01; register 18

bits | field | mode | hw rst | sw rst | description
15:8 | reserved | r/w | 0x00 | retain | reserved, set to 0x00.
7 | fifo over/underflow interrupt enable | r/w | 0x0 | retain | 1b = interrupt enable. 0b = interrupt disable.
6:4 | reserved | r/w | 0x0 | retain | reserved, set to 0x0.
3 | fifo idle inserted interrupt enable | r/w | 0x0 | retain | 1b = interrupt enable. 0b = interrupt disable.
2 | fifo idle deleted interrupt enable | r/w | 0x0 | retain | 1b = interrupt enable. 0b = interrupt disable.
1:0 | reserved | r/w | 0x0 | retain | reserved, set to 0x0.

10.2.11.26 mac specific status register (page 2), phy address 01; register 19

bits | field | mode | hw rst | sw rst | description
15:8 | reserved | ro | always 0x00 | always 0x00 | reserved, always set to 0x00.
7 | fifo over/underflow | ro, lh | 0x0 | 0x0 | 1b = over/underflow error. 0b = no fifo error.
6:4 | reserved | ro | always 0x0 | always 0x0 | reserved, always set to 0x0.
3 | fifo idle inserted | ro, lh | 0x0 | 0x0 | 1b = idle inserted. 0b = no idle inserted.
2 | fifo idle deleted | ro, lh | 0x0 | 0x0 | 1b = idle deleted. 0b = idle not deleted.
1:0 | reserved | ro | always 0x0 | always 0x0 | reserved, always set to 0x0.
10.2.11.27 mac specific control register 2 (page 2), phy address 01; register 21

bits | field | mode | hw rst | sw rst | description
15:14 | reserved | r/w | 0x0 | 0x0 | reserved, set to 0x0.
13:12 | reserved | r/w | 0x1 | update | reserved, set to 0x1.
11:7 | reserved | r/w | 0x00 | 0x00 | reserved, set to 0x00.
6 | reserved | r/w | 0x1 | update | reserved, set to 0x1.
5:4 | reserved | r/w | 0x0 | retain | reserved, set to 0x0.
3 | block carrier extension bit | r/w | 0x0 | retain | 1b = enable block carrier extension. 0b = disable block carrier extension.
2:0 | default mac interface speed | r/w | 0x6 | update | changes to these bits are disruptive to normal operation; therefore, any change to these bits must be followed by a software reset to take effect. mac interface speed during link down while auto-negotiation is enabled, and tx_clk speed (link down / 1000base-t): 000b = 10 mb/s, 2.5 mhz / 0 mhz. 001b = 100 mb/s, 25 mhz / 0 mhz. 01xb = 1000 mb/s, 0 mhz / 0 mhz. 100b = 10 mb/s, 2.5 mhz / 2.5 mhz. 101b = 100 mb/s, 25 mhz / 25 mhz. 110b = 1000 mb/s, 2.5 mhz / 2.5 mhz. 111b = 1000 mb/s, 25 mhz / 25 mhz.

10.2.11.28 led[3:0] function control register (page 3), phy address 01; register 16

bits | field | mode | hw rst | sw rst | description
15:12 | led[3] control | r/w | see description | retain | if 16_3.11:10 is set to 11b, then 16_3.15:12 has no effect. 0000b = reserved. 0001b = on - link, blink - activity, off - no link. 0010b = on - link, blink - receive, off - no link. 0011b = on - activity, off - no activity. 0100b = blink - activity, off - no activity. 0101b = on - transmit, off - no transmit. 0110b = on - 10 mb/s or 1000 mb/s master, off - else. 0111b = on - full-duplex, off - half-duplex. 1000b = force off. 1001b = force on. 1010b = force hi-z. 1011b = force blink. 11xxb = reserved. after a hardware reset, these bits are a function of pd_config_led_a[1:0]. 00b = 0001b. 01b = 0001b. 10b = 0111b. 11b = 0001b.
bits | field | mode | hw rst | sw rst | description
11:8 | led[2] control | r/w | see description | retain | 0000b = on - link, off - no link. 0001b = on - link, blink - activity, off - no link. 0010b = reserved. 0011b = on - activity, off - no activity. 0100b = blink - activity, off - no activity. 0101b = on - transmit, off - no transmit. 0110b = on - 10/1000 mb/s link, off - else. 0111b = on - 10 mb/s link, off - else. 1000b = force off. 1001b = force on. 1010b = force hi-z. 1011b = force blink. 1100b = mode 1 (dual led mode). 1101b = mode 2 (dual led mode). 1110b = mode 3 (dual led mode). 1111b = mode 4 (dual led mode). after a hardware reset, these bits are a function of pd_config_led_a[1:0]. 00b = 0000b. 01b = 0111b. 10b = 0001b. 11b = 0111b.
7:4 | led[1] control | r/w | see description | retain | if 16_3.3:2 is set to 11b, then 16_3.7:4 has no effect. 0000b = reserved. 0001b = on - link, blink - activity, off - no link. 0010b = on - link, blink - receive, off - no link. 0011b = on - activity, off - no activity. 0100b = blink - activity, off - no activity. 0101b = reserved. 0110b = on - 100/1000 mb/s link, off - else. 0111b = on - 100 mb/s link, off - else. 1000b = force off. 1001b = force on. 1010b = force hi-z. 1011b = force blink. 11xxb = reserved. after a hardware reset, these bits are a function of pd_config_led_a[1:0]. 00b = 0001b. 01b = 0111b. 10b = 0111b. 11b = 0111b.
bits | field | mode | hw rst | sw rst | description
3:0 | led[0] control | r/w | see description | retain | 0000b = on - link, off - no link. 0001b = on - link, blink - activity, off - no link. 0010b = 3 blinks - 1000 mb/s, 2 blinks - 100 mb/s, 1 blink - 10 mb/s, 0 blink - no link. 0011b = on - activity, off - no activity. 0100b = blink - activity, off - no activity. 0101b = on - transmit, off - no transmit. 0110b = on - copper link, off - else. 0111b = on - 1000 mb/s link, off - else. 1000b = force off. 1001b = force on. 1010b = force hi-z. 1011b = force blink. 1100b = mode 1 (dual led mode). 1101b = mode 2 (dual led mode). 1110b = mode 3 (dual led mode). 1111b = mode 4 (dual led mode). after a hardware reset, these bits are a function of pd_config_led_a[1:0]. 00b = 1110b. 01b = 0111b. 10b = 0111b. 11b = 0111b.
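The led[3:0] function control register (16_3) packs four 4-bit mode codes, led[3] in bits 15:12 down to led[0] in bits 3:0. A minimal sketch of packing and unpacking that layout; the two function names are hypothetical, and the mode codes themselves must be taken from the tables above, since the same code can mean different things for different leds:

```python
def pack_led_function(led3: int, led2: int, led1: int, led0: int) -> int:
    """Combine four 4-bit led mode codes into a register 16 (page 3) value."""
    for mode in (led3, led2, led1, led0):
        if not 0 <= mode <= 0xF:
            raise ValueError("each led mode is a 4-bit code")
    return (led3 << 12) | (led2 << 8) | (led1 << 4) | led0


def unpack_led_function(reg16_3: int) -> tuple:
    """Return the (led3, led2, led1, led0) mode codes from a register value."""
    return tuple((reg16_3 >> shift) & 0xF for shift in (12, 8, 4, 0))
```

For instance, `pack_led_function(0b1000, 0b0000, 0b0100, 0b0111)` forces led[3] off, shows link on led[2], blinks led[1] on activity and lights led[0] on a 1000 mb/s link.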
10.2.11.29 led[3:0] polarity control register (page 3), phy address 01; register 17

bits | field | mode | hw rst | sw rst | description
15:12 | led[5], led[3], led[1] mix percentage | r/w | see description | retain | when using two-terminal bi-color leds, the mixing percentage should not be set greater than 50%. 0000b = 0%. 0001b = 12.5%. 0111b = 87.5%. 1000b = 100%. 1001b - 1111b = reserved. after a hardware reset, these bits are a function of pd_config_led_a[1:0]. 00b = 0100b. 01b = 0100b. 10b = 1000b. 11b = 1000b.
11:8 | led[4], led[2], led[0] mix percentage | r/w | see description | retain | when using two-terminal bi-color leds, the mixing percentage should not be set greater than 50%. 0000b = 0%. 0001b = 12.5%. 0111b = 87.5%. 1000b = 100%. 1001b - 1111b = reserved. after a hardware reset, these bits are a function of pd_config_led_a[1:0]. 00b = 0100b. 01b = 0100b. 10b = 1000b. 11b = 1000b.
7:6 | led[3] polarity | r/w | 0x0 | retain | 00b = on - drive led[3] low, off - drive led[3] high. 01b = on - drive led[3] high, off - drive led[3] low. 10b = on - drive led[3] low, off - tristate led[3]. 11b = on - drive led[3] high, off - tristate led[3].
5:4 | led[2] polarity | r/w | 0x0 | retain | 00b = on - drive led[2] low, off - drive led[2] high. 01b = on - drive led[2] high, off - drive led[2] low. 10b = on - drive led[2] low, off - tristate led[2]. 11b = on - drive led[2] high, off - tristate led[2].
3:2 | led[1] polarity | r/w | 0x0 | retain | 00b = on - drive led[1] low, off - drive led[1] high. 01b = on - drive led[1] high, off - drive led[1] low. 10b = on - drive led[1] low, off - tristate led[1]. 11b = on - drive led[1] high, off - tristate led[1].
1:0 | led[0] polarity | r/w | 0x0 | retain | 00b = on - drive led[0] low, off - drive led[0] high. 01b = on - drive led[0] high, off - drive led[0] low. 10b = on - drive led[0] low, off - tristate led[0]. 11b = on - drive led[0] high, off - tristate led[0].
10.2.11.30 led timer control register (page 3), phy address 01; register 18

bits | field | mode | hw rst | sw rst | description
15 | force int | r/w | 0x0 | retain | 1b = interrupt pin is forced to be asserted. 0b = normal operation.
14:12 | pulse stretch duration | r/w | 0x4 | retain | 000b = no pulse stretching. 001b = 21 ms to 42 ms. 010b = 42 ms to 84 ms. 011b = 84 ms to 170 ms. 100b = 170 ms to 340 ms. 101b = 340 ms to 670 ms. 110b = 670 ms to 1.3 s. 111b = 1.3 s to 2.7 s.
11 | interrupt polarity | r/w | see description | retain | after a hardware reset, this bit takes on the value of pd_config_intpol_a. 0b = jt_int_s active high. 1b = jt_int_a active low.
10:8 | blink rate | r/w | see description | retain | 000b = 42 ms. 001b = 84 ms. 010b = 170 ms. 011b = 340 ms. 100b = 670 ms. 101b to 111b = reserved. after a hardware reset, these bits are a function of pd_config_led_a[1:0]. 00b = 001b. 01b = 000b. 10b = 001b. 11b = 001b.
7:4 | reserved | r/w | 0x0 | retain | reserved, set to 0x0.
3:2 | speed off pulse period | r/w | 0x1 | retain | 00b = 84 ms. 01b = 170 ms. 10b = 340 ms. 11b = 670 ms.
1:0 | speed on pulse period | r/w | 0x1 | retain | 00b = 84 ms. 01b = 170 ms. 10b = 340 ms. 11b = 670 ms.
10.2.11.31 led[5:4] function control and polarity register (page 3), phy address 01; register 19

bits 15:12: reserved (mode r/w; hw rst 0x0; sw rst retain)
  reserved, set to 0x0.

bits 11:10: led[5] polarity (mode r/w; hw rst 0x0; sw rst retain)
  00b = on - drive led[5] low, off - drive led[5] high.
  01b = on - drive led[5] high, off - drive led[5] low.
  10b = on - drive led[5] low, off - tristate led[5].
  11b = on - drive led[5] high, off - tristate led[5].

bits 9:8: led[4] polarity (mode r/w; hw rst 0x0; sw rst retain)
  00b = on - drive led[4] low, off - drive led[4] high.
  01b = on - drive led[4] high, off - drive led[4] low.
  10b = on - drive led[4] low, off - tristate led[4].
  11b = on - drive led[4] high, off - tristate led[4].

bits 7:4: led[5] control (mode r/w; hw rst see description; sw rst retain)
  if 19_3.3:2 is set to 11b, then 19_3.7:4 has no effect.
  0000b = on - receive, off - no receive.
  0001b = on - link, blink - activity, off - no link.
  0010b = on - link, blink - receive, off - no link.
  0011b = on - activity, off - no activity.
  0100b = blink - activity, off - no activity.
  0101b = on - transmit, off - no transmit.
  0110b = on - full-duplex, off - half-duplex.
  0111b = on - full-duplex, blink - collision, off - half-duplex.
  1000b = force off.
  1001b = force on.
  1010b = force hi-z.
  1011b = force blink.
  11xxb = reserved.
  after a hardware reset, this bit is a function of pd_config_led_a[1:0].
  00b = 0111b. 01b = 0100b. 10b = 0111b. 11b = 0111b.

bits 3:0: led[4] control (mode r/w; hw rst see description; sw rst retain)
  0000b = on - receive, off - no receive.
  0001b = on - link, blink - activity, off - no link.
  0010b = on - link, blink - receive, off - no link.
  0011b = on - activity, off - no activity.
  0100b = blink - activity, off - no activity.
  0101b = on - transmit, off - no transmit.
  0110b = on - full-duplex, off - half-duplex.
  0111b = on - full-duplex, blink - collision, off - half-duplex.
  1000b = force off.
  1001b = force on.
  1010b = force hi-z.
  1011b = force blink.
  1100b = mode 1 (dual led mode).
  1101b = mode 2 (dual led mode).
  1110b = mode 3 (dual led mode).
  1111b = mode 4 (dual led mode).
  after a hardware reset, this bit is a function of pd_config_led_a[1:0].
  00b = 0011b. 01b = 0110b. 10b = 0011b. 11b = 0011b.
10.2.11.32 1000 base-t pair skew register (page 5), phy address 01; register 20

bits 15:12: pair 7,8 (mdi[3]) (mode ro; hw rst 0x0; sw rst 0x0)
  skew = bit value times 8 ns. the value is correct to within 8 ns.
  the contents of 20_5.15:0 are valid only if register 21_5.6 = 1b.

bits 11:8: pair 4,5 (mdi[2]) (mode ro; hw rst 0x0; sw rst 0x0)
  skew = bit value times 8 ns. the value is correct to within 8 ns.

bits 7:4: pair 3,6 (mdi[1]) (mode ro; hw rst 0x0; sw rst 0x0)
  skew = bit value times 8 ns. the value is correct to within 8 ns.

bits 3:0: pair 1,2 (mdi[0]) (mode ro; hw rst 0x0; sw rst 0x0)
  skew = bit value times 8 ns. the value is correct to within 8 ns.

10.2.11.33 1000 base-t pair swap and polarity (page 5), phy address 01; register 21

bits 15:7: reserved (mode ro; hw rst 0x000; sw rst 0x000)

bit 6: register 20_5 and 21_5 valid (mode ro; hw rst 0x0; sw rst 0x0)
  the contents of 21_5.5:0 and 20_5.15:0 are valid only if register 21_5.6 = 1b.
  1b = valid.
  0b = invalid.

bit 5: c, d crossover (mode ro; hw rst 0x0; sw rst 0x0)
  1b = channel c received on mdi[2], channel d received on mdi[3].
  0b = channel d received on mdi[2], channel c received on mdi[3].

bit 4: a, b crossover (mode ro; hw rst 0x0; sw rst 0x0)
  1b = channel a received on mdi[0], channel b received on mdi[1].
  0b = channel b received on mdi[0], channel a received on mdi[1].

bit 3: pair 7,8 (mdi[3]) polarity (mode ro; hw rst 0x0; sw rst 0x0)
  1b = negative. 0b = positive.

bit 2: pair 4,5 (mdi[2]) polarity (mode ro; hw rst 0x0; sw rst 0x0)
  1b = negative. 0b = positive.

bit 1: pair 3,6 (mdi[1]) polarity (mode ro; hw rst 0x0; sw rst 0x0)
  1b = negative. 0b = positive.

bit 0: pair 1,2 (mdi[0]) polarity (mode ro; hw rst 0x0; sw rst 0x0)
  1b = negative. 0b = positive.

10.2.11.34 crc counters (page 6), phy address 01; register 17

bits 15:8: crc packet count (mode ro; hw rst 0x00; sw rst retain)
  0x00 = no packets received.
  0xff = 256 packets received (maximum count).
  bit 16_6.4 must be set to 1b in order for the register to be valid.

bits 7:0: crc error count (mode ro; hw rst 0x00; sw rst retain)
  0x00 = no crc errors detected in the packets received.
  0xff = 256 crc errors detected in the packets received (maximum count).
  bit 16_6.4 must be set to 1b in order for the register to be valid.
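The pair skew register is only meaningful when the valid bit (21_5.6) is set, so a decoder should check that bit first. The following is a minimal sketch with a hypothetical function name; it takes the two raw register values as plain integers and applies the "bit value times 8 ns" rule from the table.

```python
from typing import Optional


def decode_pair_skew(reg20_5: int, reg21_5: int) -> Optional[dict]:
    """Return per-pair skew in ns from register 20_5, or None when 21_5.6 = 0b."""
    if not (reg21_5 >> 6) & 1:
        return None  # contents of 20_5.15:0 are not valid
    # each 4-bit field holds the skew of one mdi pair in units of 8 ns
    return {f"mdi{i}": ((reg20_5 >> (4 * i)) & 0xF) * 8 for i in range(4)}


if __name__ == "__main__":
    print(decode_pair_skew(0x1203, 0x0040))
```

Returning `None` for the invalid case keeps stale skew values from being mistaken for a measurement.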
10.2.12 diagnostic register descriptions

the 82574 contains several diagnostic registers. these registers enable software to directly access the contents of the 82574's internal packet buffer memory (pbm), also referred to as fifo space. these registers also give software visibility into what locations in the pbm the hardware currently considers to be the head and tail for both transmit and receive operations.

10.2.12.1 phy oem bits register - poemb (0x00f10; rw)

the bits in this register are connected to the phy interface. they affect the auto-negotiation speed resolution and enable gbe mode. additionally, phy class a or b drivers are also controlled.

note: when software changes lplu, d0lplu or an1000_dis_nd0a it must wait at least 80 ns and then force the link to auto-negotiate in order to commit the changes to the phy.

field (bit(s); initial value): description
reserved (0; 1b [1]): reserved.
d0lplu (1; 0b [1]): phy auto-negotiation for slowest possible link (reverse auto-negotiation) in all power states. this bit overrides the lplu bit.
lplu (2; 1b [1]): enables phy auto-negotiation for slowest possible link (reverse auto-negotiation) in all power states except d0a (dr, d0u and d3).
an1000_dis_nd0a (3; 1b [1]): prevents phy from auto-negotiating 1000 mb/s link in all power states except d0a (dr, d0u and d3).
class_ab (4; 0b [1]): class ab driver.
reautoneg_now (5; 0b [1]): this bit can be written by software to force link auto re-negotiation.
1000_dis (6; 0b [1]): prevents phy auto-negotiating 1000 mb/s link in all power states.
auto_update (7; 0b [1]): auto-update cb. disable auto update of the flash from the shadow ram when the er_rd register is written.
pause (8; 1b): controls the pause advertisements by the phy. 1b = mac pause implemented. 0b = mac pause not implemented.
asymmetric pause (9; 1b): controls the asymmetric pause advertisement by the phy. 1b = asymmetric pause supported. 0b = asymmetric pause not supported.
reserved (31:10; 0x0): reserved.

[1] bits 7:0 of this register are loaded from nvm word 0x1c[15:8].

10.2.12.2 receive data fifo head register - rdfh (0x02410; rw)

field (bit(s); initial value): description
fifo head (12:0; 0x0): receive fifo head pointer.
reserved (31:13; 0x0): reads as 0x0. should be written to 0x0 for future compatibility.
this register stores the head pointer of the on-chip receive data fifo. since the internal fifo is organized in units of 64-bit words, this field contains the 64-bit offset of the current receive fifo head. so a value of 0x8 in this register corresponds to an offset of eight qwords or 64 bytes into the receive fifo space. this register is available for diagnostic purposes only, and should not be written during normal operation.

note: this register's address has been moved from where it was located in previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x08000. in addition, with the 82574, the value in this register contains the offset of the receive fifo head relative to the beginning of the entire pbm space. alternatively, with previous devices, the value in this register contains the relative offset to the beginning of the receive fifo space (within the pbm space).

10.2.12.3 receive data fifo tail register - rdft (0x02418; rw)

field (bit(s); initial value): description
fifo tail (12:0; 0x0): receive fifo tail pointer.
reserved (31:13; 0x0): reads as 0x0. should be written to 0x0 for future compatibility.

this register stores the tail pointer of the on-chip receive data fifo. since the internal fifo is organized in units of 64-bit words, this field contains the 64-bit offset of the current receive fifo tail. so a value of 0x8 in this register corresponds to an offset of eight qwords or 64 bytes into the receive fifo space. this register is available for diagnostic purposes only, and should not be written during normal operation.

note: this register's address has been moved from where it was located in previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x08008. in addition, with the 82574, the value in this register contains the offset of the receive fifo tail relative to the beginning of the entire pbm space. alternatively, with previous devices, the value in this register contains the relative offset to the beginning of the receive fifo space (within the pbm space).

10.2.12.4 receive data fifo head saved register - rdfhs (0x02420; rw)

field (bit(s); initial value): description
fifo head (12:0; 0x0): a saved value of the receive fifo head pointer.
reserved (31:13; 0x0): reads as 0x0. should be written to 0x0 for future compatibility.

this register stores a copy of the receive data fifo head register if the internal register needs to be restored. this register is available for diagnostic purposes only, and should not be written during normal operation.

10.2.12.5 receive data fifo tail saved register - rdfts (0x02428; rw)

field (bit(s); initial value): description
fifo tail (12:0; 0x0): a saved value of the receive fifo tail pointer.
reserved (31:13; 0x0): reads as 0x0. should be written to 0x0 for future compatibility.
this register stores a copy of the receive data fifo tail register if the internal register needs to be restored. this register is available for diagnostic purposes only, and should not be written during normal operation.

10.2.12.6 receive data fifo packet count - rdfpc (0x02430; rw)

field (bit(s); initial value): description
rx fifo packet count (12:0; 0x0): the number of received packets currently in the rx fifo.
reserved (31:13; 0x0): reads as 0x0. should be written to 0x0 for future compatibility.

this register reflects the number of receive packets that are currently in the receive fifo. this register is available for diagnostic purposes only, and should not be written during normal operation.

10.2.12.7 transmit data fifo head register - tdfh (0x03410; rw)

field (bit(s); initial value): description
fifo head (12:0; 0x600 [1]): transmit fifo head pointer.
reserved (31:13; 0x0): reads as 0x0. should be written to 0x0 for future compatibility.

[1] the initial value equals pba.rxa times 128.

this register stores the head pointer of the on-chip transmit data fifo. since the internal fifo is organized in units of 64-bit words, this field contains the 64-bit offset of the current transmit fifo head. so a value of 0x8 in this register corresponds to an offset of eight qwords or 64 bytes into the transmit fifo space. this register is available for diagnostic purposes only, and should not be written during normal operation.

note: this register's address has been moved from where it was located in the previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x08010. in addition, with the 82574, the value in this register contains the offset of the transmit fifo head relative to the beginning of the entire pbm space. alternatively, with the previous devices, the value in this register contains the relative offset to the beginning of the transmit fifo space (within the pbm space).

10.2.12.8 transmit data fifo tail register - tdft (0x03418; rw)

field (bit(s); initial value): description
fifo tail (12:0; 0x600 [1]): transmit fifo tail pointer.
reserved (31:13; 0x0): reads as 0x0. should be written to 0x0 for future compatibility.

[1] the initial value equals pba.rxa times 128.

this register stores the tail pointer of the on-chip transmit data fifo. since the internal fifo is organized in units of 64-bit words, this field contains the 64-bit offset of the current transmit fifo tail. so a value of 0x8 in this register corresponds to an offset of 8 qwords or 64 bytes into the transmit fifo space. this register is available for diagnostic purposes only, and should not be written during normal operation.
note: this register's address has been moved from where it was located in the previous devices. however, for backwards compatibility, this register can also be accessed at its alias offset of 0x08018. in addition, with the 82574, the value in this register contains the offset of the transmit fifo tail relative to the beginning of the entire pbm space. alternatively, with the previous devices, the value in this register contains the relative offset to the beginning of the transmit fifo space (within the pbm space).

10.2.12.9 transmit data fifo head saved register - tdfhs (0x03420; rw)

field (bit(s); initial value): description
fifo head (12:0; 0x600 [1]): a saved value of the transmit fifo head pointer.
reserved (31:13; 0x0): reads as 0x0. should be written to 0x0 for future compatibility.

[1] the initial value equals pba.rxa times 128.

this register stores a copy of the transmit data fifo head register if the internal register needs to be restored. this register is available for diagnostic purposes only, and should not be written during normal operation.

10.2.12.10 transmit data fifo tail saved register - tdfts (0x03428; rw)

field (bit(s); initial value): description
fifo tail (12:0; 0x600 [1]): a saved value of the transmit fifo tail pointer.
reserved (31:13; 0x0): reads as 0x0. should be written to 0x0 for future compatibility.

[1] the initial value equals pba.rxa times 128.

this register stores a copy of the transmit data fifo tail register if the internal register needs to be restored. this register is available for diagnostic purposes only, and should not be written during normal operation.

10.2.12.11 transmit data fifo packet count - tdfpc (0x03430; rw)

field (bit(s); initial value): description
tx fifo packet count (12:0; 0x0): the number of packets to be transmitted that are currently in the tx fifo.
reserved (31:13; 0x0): reads as 0x0. should be written to 0x0 for future compatibility.

this register reflects the number of packets to be transmitted that are currently in the transmit fifo. this register is available for diagnostic purposes only, and should not be written during normal operation.

10.2.12.12 packet buffer memory - pbm (0x10000 - 0x17fff; rw)

field (bit(s); initial value): description
fifo data (31:0; x): packet buffer data.
all pbm (fifo) data is available to diagnostics. locations can be accessed as 32-bit or 64-bit words. the internal pbm is 40 kb in size. as mentioned in section 10.2.7.36, software can configure the amount of pbm space that is used as the transmit fifo versus the receive fifo. the default is 16 kb of transmit fifo space and 16 kb of receive fifo space. regardless of the individual fifo sizes that software configures, the rx fifo is located first in the memory mapped pbm space. so for the default fifo configuration, the rx fifo occupies offsets 0x10000-0x13fff of the memory mapped space, while the tx fifo occupies offsets 0x14000-0x17fff of the memory mapped space.

10.2.12.13 packet buffer size - pbs (0x01008; rw)

this register sets the on-chip receive and transmit storage allocation size. the allocation value is read/write for the lower six bits. the division between transmit and receive is done according to the pba register.

note: programming this register does not automatically re-load or initialize internal packet-buffer ram pointers. the software must reset both transmit and receive operation (using the global device reset ctrl.rst bit) after changing this register in order for it to take effect. the pbs register itself is not reset by asserting the global reset, but is only reset at initial hardware power on.

note: programming this register should be aligned with programming the pba register. if pba and pbs are not coordinated, hardware operation is not determined.

field (bit(s); initial value): description
pbs (15:0; 0x0028): packet buffer size. lower six bits declare the packet buffer size both for transmit and receive in 1 kb granularity. the upper 10 bits are read as zero. the default is 40 kb.
rsvd (31:16; 0x0000): reserved. read as zero.
11.0 diagnostics

to assist in test and debug of the software device driver, a set of software-usable features have been provided in the component. these features include controls for specific test-mode usage, as well as some registers for verifying the 82574's internal state against what the software device driver is expecting.

11.1 introduction

the 82574 provides software visibility (and controllability) into certain major internal data structures, including all of the transmit and receive fifo space. however, interlocks are not provided for any operations, so diagnostic accesses can only be performed under very controlled circumstances.

the 82574 also provides software-controllable support for certain loopback modes, to enable a software device driver to test transmit and receive flows to itself. loopback modes can also be used to diagnose communication problems and attempt to isolate the location of a break in the communications path.

11.2 fifo pointer accessibility

the 82574's internal pointers into its transmit and receive data fifos are visible through the head and tail diagnostic data fifo registers. see section 10.2.12. diagnostics software can read these fifo pointers to confirm an expected hardware state following a sequence of operation(s). diagnostic software can further write to these pointers as a partial step to verify expected fifo contents following a specific operation, or to subsequently write data directly to the data fifos.

11.3 fifo data accessibility

the 82574's internal transmit and receive data fifo contents are directly readable and writeable through the pbm register. the specific locations read or written are determined by the values of the fifo pointers, which can be read and written. when accessing the actual fifo data structures, locations must be accessed as 32-bit words. see section 10.2.12.
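The relationship between a fifo pointer value and a location in the memory-mapped pbm window can be sketched as simple arithmetic. The example below assumes the 82574 behavior described in section 10.2.12 (pointers count 64-bit words from the start of the entire pbm space, which is mapped at 0x10000) and the stated rule that the transmit pointer's initial value equals pba.rxa times 128. All function and constant names are hypothetical.

```python
PBM_BASE = 0x10000       # start of the memory-mapped pbm window (section 10.2.12.12)
QWORD_BYTES = 8          # the fifo is organized in 64-bit words
POINTER_MASK = 0x1FFF    # pointer fields occupy bits 12:0


def fifo_ptr_to_byte_offset(ptr: int) -> int:
    """Convert a head/tail pointer (in qwords) to a byte offset into the pbm."""
    return (ptr & POINTER_MASK) * QWORD_BYTES


def fifo_ptr_to_pbm_address(ptr: int) -> int:
    """Register-space address of the pbm location a pointer refers to."""
    return PBM_BASE + fifo_ptr_to_byte_offset(ptr)


def tx_fifo_start_ptr(rxa_kb: int) -> int:
    """Initial transmit pointer: pba.rxa (in kb) times 128 qwords per kb."""
    return rxa_kb * 128


if __name__ == "__main__":
    print(fifo_ptr_to_byte_offset(0x8))                         # 64 bytes
    print(hex(fifo_ptr_to_pbm_address(tx_fifo_start_ptr(16))))  # 16 kb rx fifo
```

With a 16 kb receive fifo the transmit fifo begins at pointer 0x800, which maps to offset 0x14000, matching the default layout described for the pbm.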
11.4 loopback operations

loopback operations are supported by the 82574 to assist with system and device debug. loopback operation can be used to test transmit and receive aspects of software device drivers, as well as to verify electrical integrity of the connections between the 82574 and the system (such as pcie bus connections). loopback operation is supported as follows:

note: configuration for loopback operation varies depending on the link configuration being used.

• mac loopback while operating with the internal phy
• loopback: to configure for loopback operation, the rctl.lbm should remain configured as for normal operation (set = 00b). the phy must be programmed, using mdio accesses to its mii management registers, to perform loopback within the phy.

note: all loopback modes are only allowed when the 82574 is configured for full-duplex operation.

note: mac loopback is not functional when the mac is configured to work at 10 mb/s.
12.0 electrical specifications

12.1 introduction

this chapter describes the 82574's electrical properties.

12.2 voltage regulator power supply specification

12.2.1 3.3 v dc rail

title (description): value, units
rise time (time from 10% to 90% mark): min 1, max 100 ms
monotonicity (voltage dip allowed in ramp): 0 mv dc
slope (ramp rate at any given time between 10% and 90%): 2880 v dc/s
operational range (voltage range for normal operating conditions): min 3, max 3.6 v dc
ripple (maximum voltage ripple @ bw = 50 mhz): 70 mv
overshoot (maximum voltage allowed): 4 v dc
capacitance (minimum capacitance): 25 µf

12.2.2 1.9 v dc rail

title (description): value, units
rise time (time from 10% to 90% mark): min 1, max 100 ms
monotonicity (voltage dip allowed in ramp): 0 mv dc
slope (ramp rate at any given time between 10% and 90%): 1440 v dc/s
operational range (voltage range for normal operating conditions): min 1.8, max 2 v dc
ripple (maximum voltage ripple @ bw = 50 mhz): 50 mv dc
overshoot (maximum voltage allowed): 2.7 v dc
output capacitance (capacitance range when using pnp circuit): min 20, max 40 µf
input capacitance (capacitance range when using pnp circuit): 20 µf
capacitance esr (equivalent series resistance of output capacitance [1]): min 5, max 100 mΩ
ictrl (maximum output current rating to ctrl18): 10 ma

[1] do not use tantalum capacitors.
12.2.3 1.05 v dc rail

title (description): value, units
rise time (time from 10% to 90% mark): min 1, max 100 ms
monotonicity (voltage dip allowed in ramp): 0 mv dc
slope (ramp rate at any given time between 10% and 90%): 800 v dc/s
operational range (voltage range for normal operating conditions): min -5, max +5 %
ripple (maximum voltage ripple @ bw = 50 mhz): 50 mv dc
overshoot (maximum voltage allowed): 1.5 v dc
output capacitance (capacitance range when using pnp circuit): min 20, max 40 µf
input capacitance (capacitance range when using pnp circuit): 20 µf
capacitance esr (equivalent series resistance of output capacitance [1]): 10 mΩ
ictrl (maximum output current rating to ctrl10): 10 ma

[1] do not use tantalum capacitors.

12.2.4 pnp specifications

table 82. external power supply specification

title (description): value, units
vcbo: 20 v dc
vceo: 20 v dc
ic(max): 1 a
ic(peak): 1.2 a
ptot (minimum total dissipated power @ 25 c ambient temperature): 1.5 w
hfe (dc current gain @ vce = -10 v dc, ic = 500 ma): 85
hfe (ac current gain @ ic = 50 ma, vce = -10 v dc, f = 20 mhz): 2.5
cc (collector capacitance @ vcb = -5 v, f = 1 mhz): 50 pf
ft (transition frequency @ ic = 10 ma, vce = -5 v dc, f = 100 mhz): 40 mhz
recommended transistor: bcp69
12.3 power sequencing

for proper and safe operation, the power supplies must follow this rule:

vdd3p3 (3.3 v dc) ≥ avdd1p9 (1.9 v dc) ≥ vdd1p0 (1.05 v dc)

this means that vdd3p3 must start ramping before avdd1p9 and vdd1p0, but vdd1p0 might reach its nominal operating range before avdd1p9 and vdd3p3. basically, the higher voltages must be greater than or equal to the lower voltages. this is necessary to avoid low impedance paths through clamping diodes and to eliminate back-powering. the same requirements apply to the power-down sequence.

internal power on reset must be low throughout the time that the power supplies are ramping. this guarantees that the mac and phy reset cleanly. while internal power on reset is low, reset to the phy is also asserted. after the power supplies are valid, internal power on reset must remain low for at least t clk125start to guarantee that the clk125 clock from the phy is running.

12.4 power-on reset

• power up sequence: 3.3 v dc -> 1.9 v dc -> 1.05 v dc
• power down sequence: 1.05 v dc -> 1.9 v dc -> 3.3 v dc

table 83. power detection thresholds

symbol  parameter                             min   typ   max   units
v1a     high threshold for 3.3 v dc supply    1.35  1.7   2.0   v dc
v2a     low threshold for 3.3 v dc supply     1.35  1.6   1.9   v dc
v1b     high threshold for 1.05 v dc supply   0.6   0.7   0.75  v dc
v2b     low threshold for 1.05 v dc supply    0.35  0.45  0.6   v dc
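The sequencing rule (each higher rail stays at or above the lower rails throughout ramp-up and ramp-down) lends itself to a quick check on captured rail voltages. The following is a minimal sketch assuming the three rails were sampled simultaneously, for example from an oscilloscope export; the function name and the sample format are hypothetical.

```python
from typing import Iterable, Optional, Tuple

Sample = Tuple[float, float, float]  # (vdd3p3, avdd1p9, vdd1p0) in volts


def first_sequencing_violation(samples: Iterable[Sample]) -> Optional[int]:
    """Return the index of the first sample where the rule
    vdd3p3 >= avdd1p9 >= vdd1p0 does not hold, or None if it always holds."""
    for index, (v3p3, v1p9, v1p0) in enumerate(samples):
        if not (v3p3 >= v1p9 >= v1p0):
            return index
    return None


if __name__ == "__main__":
    ramp = [(0.0, 0.0, 0.0), (1.2, 0.6, 0.6), (2.5, 1.5, 1.05), (3.3, 1.9, 1.05)]
    print(first_sequencing_violation(ramp))  # None: the rule holds at every sample
```

The same check applies unchanged to a power-down capture, since the rule is stated to hold in both directions.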
12.5 power scheme solutions

figure 62 shows the intended design options for power solutions. the values for the various components in figure 62 are listed in table 84; table 85 and table 86 list the power consumption values.

figure 62. power scheme schematics
option a: fully integrated 1.05 v dc regulator, 1.9 v dc pnp transistor regulator
option b: external 1.05 v dc, 1.9 v dc pnp transistor regulator
option c: all external power supplies
option d: fully integrated 1.05 v dc, external 1.9 v dc regulator
table 84. parameters for power scheme options

option columns, in order: option a; option b [1]; option c; option d
c1: 10 µf; 10 µf; 10 µf; 10 µf
c2: 22 µf + 0.1 µf (multiple); 10 µf; 22 µf + 0.1 µf (multiple); 22 µf + 0.1 µf (multiple)
c3: 10 µf; 10 µf
c4: 10 µf + 0.1 µf (multiple near pins); 22 µf + 0.1 µf (multiple near pins); 10 µf + 0.1 µf (multiple near pins)
c5: 10 µf + 0.1 µf (multiple near pins)
r1: 0 Ω; 0 Ω
r2: 0 Ω
r: 5 kΩ; 5 kΩ

[1] 1.05 v dc pnp uses 1.9 v dc from pnp.

notes:
1. all capacitors are ceramic type.
2. 10 µf capacitance can be 2 x 4.7 µf.
3. 22 µf can be 2 x 10 µf or 4 x 4.7 µf for 1.9 v dc bypass.
4. place 0.1 µf capacitors near pins.
5. pnp must be placed 0.5-inch (10 mm) from the 82574.
6. vdd1p0 pins are connected together by a plane.

note: the following numbers apply to device current and power and do not include power losses on external components.

table 85. options b and c power consumption (external 1.05 v dc regulator)

state         mode                       3.3 [ma]  1.9 [ma]  1.05 [ma]  power [mw]
s0 - maximum  1000base-t active, 90 c    5         266       195        727
s0 - typical  1000base-t active          4         261       184        702
              1000base-t idle            4         217       108        539
              100base-t active           4         116       60         296
              100base-t idle             4         71        22         171
              10base-t active            4         162       48         372
              10base-t idle              4         70        11         157
              cable disconnect           4         14        5          45
              lan disable                4         13        2          40
sx            d3 cold with wol 100 mb/s  4         71        22         171
              d3 cold with wol 10 mb/s   4         70        11         157
              d3 cold without wol        4         8         5          34
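The power column in table 85 can be cross-checked from the per-rail currents. The sketch below assumes that device power is simply the sum of each rail current multiplied by its nominal rail voltage (3.3 v, 1.9 v, 1.05 v); the datasheet does not state the formula, but this assumption reproduces the listed values to within rounding. The function name is hypothetical.

```python
RAIL_VOLTS = (3.3, 1.9, 1.05)  # nominal rail voltages, taken from the column headings


def device_power_mw(i_3p3_ma: float, i_1p9_ma: float, i_1p05_ma: float) -> float:
    """Device power in mw: sum of nominal voltage (v) times current (ma) per rail."""
    currents = (i_3p3_ma, i_1p9_ma, i_1p05_ma)
    return sum(v * i for v, i in zip(RAIL_VOLTS, currents))


if __name__ == "__main__":
    # s0 - typical, 1000base-t active: 4 ma, 261 ma, 184 ma
    print(round(device_power_mw(4, 261, 184)))  # table 85 lists 702
    # s0 - maximum, 1000base-t active, 90 c: 5 ma, 266 ma, 195 ma
    print(round(device_power_mw(5, 266, 195)))  # table 85 lists 727
```

Because the figures exclude losses in external components, the power drawn from the board supply (for example through the pnp pass transistor) will be higher than this device-only estimate.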
table 86. options a and d power consumption (fully integrated 1.05 v dc regulator)

state         mode                       3.3 [ma]  1.9 [ma]  power [mw]
s0 - maximum  1000base-t active, 90 c    5         471       911
s0 - typical  1000base-t active          4         455       878
              1000base-t idle            4         331       642
              100base-t active           4         178       351
              100base-t idle             4         93        190
              10base-t active            4         212       416
              10base-t idle              4         81        167
              cable disconnect           4         18        44
              lan disable                4         12        36
sx            d3 cold with wol 100 mb/s  4         92        188
              d3 cold with wol 10 mb/s   4         81        167
              d3 cold without wol        4         13        35
12.6 discrete/integrated magnetics specifications

criteria: condition: values (min/max)

voltage isolation:
  at 50 to 60 hertz for 60 seconds: 1500 vrms (min)
  for 60 seconds: 2250 v dc (min)

open circuit inductance (ocl) or ocl (alternate):
  with 8 ma dc bias at 25 °c: 400 µh (min)
  with 8 ma dc bias at 0 °c to 70 °c: 350 µh (min)

insertion loss:
  100 khz through 999 khz: 1 db (max)
  1.0 mhz through 60 mhz: 0.6 db (max)
  60.1 mhz through 80 mhz: 0.8 db (max)
  80.1 mhz through 100 mhz: 1.0 db (max)
  100.1 mhz through 125 mhz: 2.4 db (max)

return loss (when reference impedance is 85 Ω, 100 Ω, and 115 Ω. note that return loss values might vary with mdi trace lengths. the lan magnetics might need to be measured in the platform where it is used):
  1.0 mhz through 40 mhz: 18 db (min)
  40.1 mhz through 100 mhz: 12 - 20 * log(frequency in mhz / 80) db (min)

crosstalk isolation, discrete modules:
  1.0 mhz through 29.9 mhz: -50.3+(8.8*(freq in mhz / 30)) db (max)
  30 mhz through 250 mhz: -26-(16.8*(log(freq in mhz / 250))) db (max)
  250.1 mhz through 375 mhz: -26 db (max)

crosstalk isolation, integrated modules:
  1.0 mhz through 10 mhz: -50.8+(8.8*(freq in mhz / 10)) db (max)
  10.1 mhz through 100 mhz: -26-(16.8*(log(freq in mhz / 100))) db (max)
  100.1 mhz through 375 mhz: -26 db (max)

diff to cmr:
  1.0 mhz through 29.9 mhz: -40.2+(5.3*(freq in mhz / 30)) db (max)
  30 mhz through 500 mhz: -22-(14*(log(freq in mhz / 250))) db (max)

cm to cmr:
  1.0 mhz through 270 mhz: -57+(38*(freq in mhz / 270)) db (max)
  270.1 mhz through 300 mhz: -17-2*((300-(freq in mhz)) / 30) db (max)
  300.1 mhz through 500 mhz: -17 db (max)
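The frequency-dependent return loss limit is easier to apply as a function. This sketch rests on two assumptions that the extracted text does not state outright: that the second band reads 12 - 20 * log(f / 80), and that "log" is the base-10 logarithm. Under that reading the two bands meet at about 18 db near 40 mhz, which is consistent with the first band. The function name is hypothetical.

```python
import math


def min_return_loss_db(freq_mhz: float) -> float:
    """Minimum return loss (db) for the magnetics at a given frequency in mhz.

    Assumes the limit is 18 db from 1 to 40 mhz and
    12 - 20*log10(f/80) db above 40 mhz up to 100 mhz.
    """
    if 1.0 <= freq_mhz <= 40.0:
        return 18.0
    if 40.0 < freq_mhz <= 100.0:
        return 12.0 - 20.0 * math.log10(freq_mhz / 80.0)
    raise ValueError("return loss limit is only specified from 1 mhz to 100 mhz")


if __name__ == "__main__":
    for f in (10.0, 40.1, 80.0, 100.0):
        print(f, round(min_return_loss_db(f), 2))
```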
12.7 oscillator/crystal specifications

see figure 63 for recommended crystal placement and layout instructions.

table 87. external crystal specifications

parameter name (symbol): recommended value; max/min range; conditions
frequency (fo): 25 [mhz]; conditions @25 [c]
vibration mode: fundamental
frequency tolerance @25 c (df/fo @25c): 30 [ppm]; conditions @25 [c]
temperature tolerance (df/fo): 30 [ppm]
series resistance (esr) (rs): 50 [Ω] max; conditions @25 [mhz]
crystal load capacitance (cload): 18 [pf]
shunt capacitance (co): 6 [pf] max
drive level (dl): 300 [µw] max
aging (df/fo): 5 ppm per year; 5 ppm per year max
calibration mode: parallel
insulation resistance: 500 [MΩ] min; conditions @ 100 v dc
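A ppm tolerance translates into an absolute frequency window by simple scaling. The sketch below is illustrative only: the helper names are hypothetical, and it treats the 30 ppm figure as a symmetric tolerance around the nominal 25 mhz, which is an assumption since the sign is not shown in the table.

```python
from typing import Tuple


def ppm_to_hz(nominal_hz: float, ppm: float) -> float:
    """Absolute deviation in hz for a tolerance given in parts per million."""
    return nominal_hz * ppm / 1e6


def frequency_window_hz(nominal_hz: float, ppm: float) -> Tuple[float, float]:
    """Lowest and highest frequency allowed by a symmetric ppm tolerance."""
    delta = ppm_to_hz(nominal_hz, ppm)
    return nominal_hz - delta, nominal_hz + delta


if __name__ == "__main__":
    print(ppm_to_hz(25e6, 30))            # 750.0 hz
    print(frequency_window_hz(25e6, 30))  # (24999250.0, 25000750.0)
```

Initial tolerance, temperature tolerance, and aging are separate rows in the table, so a budget calculation would add their contributions rather than use a single figure.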
table 88. clock oscillator specifications

parameter name (symbol/parameter; conditions): min, typ, max values; unit
frequency (fo; @25 [c]): 25.0 mhz
swing (vp-p1): 3, 3.3, 3.6 v
frequency tolerance (f/fo; -20 to +70): 50 [ppm]
operating temperature (topr): -20 to +70 [c]
aging (f/fo): 5 ppm per year [ppm]
coupling capacitor (ccoupling): 12, 15, 18 [pf]
th_xtal_in (xtal_in high time): 13, 20 ns
tl_xtal_in (xtal_in low time): 13, 20 ns
tj_xtal_in (xtal_in total jitter): 200 [1] ps

[1] broadband peak-to-peak = 200 ps, broadband rms = 3 ps, 12 khz to 20 mhz rms = 1 ps.

note: peak-to-peak voltage presented at the xtal1 input cannot exceed 1.9 v dc.

figure 63. xtal timing diagram

12.8 i/o dc parameters

this section specifies the timing and electrical parameters for the various i/o interfaces.
12.8.1 test, jtag and nc-si

12.8.2 leds

symbol/parameter (conditions): min, typ, max; unit
vdd3p3: min 3.0, typ 3.3, max 3.6 v dc
vil: min -0.65, max 0.8 v dc
vih: min 2.0, max vdd3p3+0.4 v dc
input leakage 0

12.8.3 smbus

symbol/parameter (conditions): value; unit
vil: -0.4, 0.9 v dc
vih: 1.6, vdd3p3+0.4 v dc
voh: 3.3 v dc
vol (maximum @ ipullup): 0.4 v dc
ipullup: 4 ma
ileak: +/-10 µa
ci: 10 pf
vnoise = 0.3 v dc peak-to-peak
tpad-in (maximum @ cin = 2 nand gate input loads): 5 ns
tout_pad (maximum @ cpad = 400 pf): 100 ns
toeb_pad (maximum @ cpad = 400 pf): 100 ns
13.0 design considerations

this section provides general design considerations and recommendations when selecting components and connecting special pins to the 82574.

13.1 pcie

13.1.1 port connection to the 82574

pcie is a dual simplex point-to-point serial differential low-voltage interconnect with a signaling bit rate of 2.5 gb/s per direction. the 82574's pcie port consists of an integral group of transmitters and receivers. the link between the pcie ports of two devices is a x1 lane that also consists of a transmitter and a receiver pair. note that each signal is 8b/10b encoded with an embedded clock.

the pcie topology consists of a transmitter (tx) located on one device connected through a differential pair to the receiver (rx) on a second device. the 82574 can be located on a motherboard or on an add-in card using a connector specified by pcie.

the lane is ac-coupled between its corresponding transmitter and receiver. the ac-coupling capacitor is located on the board close to the transmitter side. each end of the link is terminated on the die into nominal 100 Ω differential dc impedance. board termination is not required.

for more information on pcie, refer to the pci express* base specification, revision 1.1 and pci express* card electromechanical specification, revision 1.1rd.

for information about the 82574's pcie power management capabilities, see section 5.0.

13.1.2 pcie reference clock

the 82574 uses a 100 mhz differential reference clock, denoted peclkp and peclkn. this signal is typically generated on the system board and routed to the pcie port. for add-in cards, the clock is furnished at the pcie connector.

the frequency tolerance for the pcie reference clock is +/- 300 ppm.

13.1.3 other pcie signals

the 82574 also implements other signals required by the pcie specification.
the 82574 signals power management events to the system using the pe_wake_n signal, which operates very similarly to the familiar pci pme# signal. finally, there is a pe_rst_n signal, which serves as the familiar reset function for the 82574.
13.1.4 pcie routing

contact your intel representative for information regarding the pcie signal routing.

13.2 clock source

all designs require a 25 mhz clock source. the 82574 uses the 25 mhz source to generate clocks up to 125 mhz and 1.25 ghz for the phy circuits. for optimum results with lowest cost, connect a 25 mhz parallel resonant crystal and appropriate load capacitors at the xtal1 and xtal2 leads. the frequency tolerance of the timing device should be 30 ppm or better. refer to the intel® ethernet controllers timing device selection guide for more information on choosing crystals.

for further information regarding the clock for the 82574, refer to the sections about frequency control, crystals, and oscillators that follow.

13.2.1 frequency control device design considerations

this section provides information regarding frequency control devices, including crystals and oscillators, for use with all intel ethernet controllers. several suitable frequency control devices are available, none of which present any unusual challenges in selection. the concepts documented herein are applicable to other data communication circuits, including platform lan connect devices (phys).

the intel ethernet controllers contain amplifiers, which when used with the specific external components, form the basis for feedback oscillators. these oscillator circuits, which are both economical and reliable, are described in more detail in section 13.3.1.

the intel ethernet controllers also have bus clock input functionality; however, a discussion of this feature is beyond the scope of this document and will not be addressed.

the chosen frequency control device vendor should be consulted early in the design cycle. crystal and oscillator manufacturers familiar with networking equipment clock requirements may provide assistance in selecting an optimum, low-cost solution.
13.2.2 frequency control component types

several types of third-party frequency reference components are currently marketed. a discussion of each follows, listed in preferred order.

13.2.2.1 quartz crystal

quartz crystals are generally considered to be the mainstay of frequency control components due to their low cost and ease of implementation. they are available from numerous vendors in many package types and with various specification options.

13.2.2.2 fixed crystal oscillator

a packaged fixed crystal oscillator comprises an inverter, a quartz crystal, and passive components conveniently packaged together. the device renders a strong, consistent square wave output. oscillators used with microprocessors are supplied in many configurations and tolerances.

crystal oscillators should be restricted to use in special situations, such as shared clocking among devices or multiple controllers. as clock routing can be difficult to accomplish, it is preferable to provide a separate crystal for each device.
13.2.2.3 programmable crystal oscillators

a programmable oscillator can be configured to operate at many frequencies. the device contains a crystal frequency reference and a phase lock loop (pll) clock generator. the frequency multipliers and divisors are controlled by programmable fuses.

a programmable oscillator's accuracy depends heavily on the ethernet device's differential transmit lines. the physical layer (phy) uses the clock input from the device to drive a differential manchester (for 10 mb/s operation), an mlt-3 (for 100 mb/s operation) or a pam-5 (for 1000 mb/s operation) encoded analog signal across the twisted pair cable. these signals are referred to as self-clocking, which means the clock must be recovered at the receiving link partner. clock recovery is performed with another pll that locks onto the signal at the other end.

plls are prone to exhibit frequency jitter. the transmitted signal can also have considerable jitter even with the programmable oscillator working within its specified frequency tolerance. plls must be designed carefully to lock onto signals over a reasonable frequency range. if the transmitted signal has high jitter and the receiver's pll loses its lock, then bit errors or link loss can occur.

phy devices are deployed for many different communication applications. some phys contain plls with marginal lock range and cannot tolerate the jitter inherent in data transmission clocked with a programmable oscillator. the american national standards institute (ansi) x3.263-1995 standard test method for transmit jitter is not stringent enough to predict pll-to-pll lock failures; therefore, the use of programmable oscillators is not recommended.

13.2.2.4 ceramic resonator

similar to a quartz crystal, a ceramic resonator is a piezoelectric device. a ceramic resonator typically carries a frequency tolerance of 0.5%, which is inadequate for use with intel ethernet controllers, and therefore should not be utilized.
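To put the 0.5% ceramic resonator tolerance on the same scale as the ppm figures used elsewhere in this chapter, the short sketch below converts percent to parts per million and compares the result with the 50 ppm and 30 ppm limits cited in section 13.3.1.3. The constant and function names are hypothetical.

```python
IEEE_LIMIT_PPM = 50    # ieee 802.3 limit quoted in section 13.3.1.3
RECOMMENDED_PPM = 30   # intel recommendation quoted in the same section


def percent_to_ppm(percent):
    """1 percent equals 10,000 parts per million."""
    return percent * 10_000


def meets_limit(tolerance_ppm, limit_ppm=IEEE_LIMIT_PPM):
    """True when a device tolerance is no worse than the given limit."""
    return tolerance_ppm <= limit_ppm


if __name__ == "__main__":
    resonator_ppm = percent_to_ppm(0.5)
    print(f"ceramic resonator: {resonator_ppm:.0f} ppm")
    print("meets ieee limit:", meets_limit(resonator_ppm))
    print("meets recommendation:", meets_limit(resonator_ppm, RECOMMENDED_PPM))
```

A 0.5% tolerance is 5000 ppm, two orders of magnitude looser than the 50 ppm limit.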
13.3 crystal support

13.3.1 crystal selection parameters

all crystals used with intel ethernet controllers are described as at-cut, which refers to the angle at which the unit is sliced with respect to the long axis of the quartz stone.

table 89 lists crystals which have been used successfully in other designs (however, no particular product is recommended):

table 89. crystal manufacturers and part numbers

manufacturer              part no.
kds america               dsx321g
ndk america inc.          41cd25.0f1303018
txc corporation - usa     7a25000165, 9c25000008

for information about crystal selection parameters, see section 12.7 and table 87.

13.3.1.1 vibrational mode

crystals in the above-referenced frequency range are available in both fundamental and third overtone. unless there is a special need for third overtone, use fundamental mode crystals.

at any given operating frequency, third overtone crystals are thicker and more rugged than fundamental mode crystals. third overtone crystals are more suitable for use in military or harsh industrial environments. third overtone crystals require a trap circuit (extra capacitor and inductor) in the load circuitry to suppress fundamental mode oscillation as the circuit powers up. selecting values for these components is beyond the scope of this document.

13.3.1.2 nominal frequency

intel ethernet controllers use a crystal frequency of 25.000 mhz. the 25 mhz input is used to generate a 125 mhz transmit clock for 100base-tx and 1000base-tx operation, and 10 mhz and 20 mhz transmit clocks for 10base-t operation.

13.3.1.3 frequency tolerance

the frequency tolerance for an ethernet platform lan connect is dictated by the ieee 802.3 specification as 50 parts per million (ppm). this measurement is referenced to a standard temperature of 25 °c. intel recommends a frequency tolerance of 30 ppm.

13.3.1.4 temperature stability and environmental requirements

temperature stability is a standard measure of how the oscillation frequency varies over the full operational temperature range (and beyond). several optional temperature ranges are currently available, including -40 °c to +85 °c for industrial environments. some vendors separate operating temperatures from temperature stability. manufacturers may also list temperature stability as 50 ppm in their data sheets.

note: crystals also carry other specifications for storage temperature, shock resistance, and reflow solder conditions. crystal vendors should be consulted early in the design cycle to discuss the application and its environmental requirements.
13.3.1.5 calibration mode

the terms series-resonant and parallel-resonant are often used to describe crystal oscillator circuits. specifying parallel mode is critical to determining how the crystal frequency is calibrated at the factory.

a crystal specified and tested as series resonant oscillates without problem in a parallel-resonant circuit, but the frequency is higher than nominal by several hundred parts per million. the purpose of adding load capacitors to a crystal oscillator circuit is to establish resonance at a frequency higher than the crystal's inherent series resonant frequency.

figure 64 shows the recommended placement and layout of an internal oscillator circuit. note that pins x1 and x2 refer to xtal1 and xtal2 in the ethernet device, respectively. the crystal and the capacitors form a feedback element for the internal inverting amplifier. this combination is called parallel-resonant, because it has positive reactance at the selected frequency. in other words, the crystal behaves like an inductor in a parallel lc circuit. oscillators with piezoelectric feedback elements are also known as "pierce" oscillators.

13.3.1.6 load capacitance

the formula for crystal load capacitance is as follows:

\[ c_l = \frac{c1 \cdot c2}{c1 + c2} + c_{stray} \]

where c1 = c2 = 27 pf and c_stray = allowance for additional capacitance in pads, traces and the chip carrier within the ethernet device package.

an allowance of 3 pf to 7 pf accounts for lumped stray capacitance. the calculated load capacitance is 16 pf with an estimated stray capacitance of about 5 pf.

individual stray capacitance components can be estimated and added. for example, surface mount pads for the load capacitors add approximately 2.5 pf in parallel to each capacitor. this technique is especially useful if y1, c1 and c2 must be placed farther than approximately one-half (0.5) inch from the device. it is worth noting that thin circuit boards generally have higher stray capacitance than thick circuit boards. consult the pcie design guide for more information.

the oscillator frequency should be measured with a precision frequency counter where possible. the load specification or values of c1 and c2 should be fine tuned for the design. as the actual capacitance load increases, the oscillator frequency decreases.

note: c1 and c2 may vary by as much as 5% (approximately 1 pf) from their nominal values.

13.3.1.7 shunt capacitance

the shunt capacitance parameter is relatively unimportant compared to load capacitance. shunt capacitance represents the effect of the crystal's mechanical holder and contacts. the shunt capacitance should equal a maximum of 6 pf.
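The load capacitance relation in section 13.3.1.6 (the series combination of c1 and c2 plus the stray allowance) can be evaluated with a few lines of Python. This is a minimal sketch with hypothetical function names. Note that plain arithmetic with c1 = c2 = 27 pf and 5 pf of stray capacitance gives 18.5 pf, whereas the 16 pf quoted in the text corresponds to 22 pf capacitors with 5 pf stray (or 27 pf capacitors with about 2.5 pf stray), so the result should be checked against the load rating of the chosen crystal.

```python
def load_capacitance_pf(c1_pf, c2_pf, c_stray_pf):
    """Series combination of the two load capacitors plus stray capacitance."""
    return (c1_pf * c2_pf) / (c1_pf + c2_pf) + c_stray_pf


def matched_caps_for_load(c_load_pf, c_stray_pf):
    """Value of c1 = c2 needed to present a given load to the crystal."""
    return 2 * (c_load_pf - c_stray_pf)


if __name__ == "__main__":
    # sweep the 3 pf to 7 pf stray allowance mentioned in the text
    for stray in (3, 5, 7):
        cl = load_capacitance_pf(27, 27, stray)
        print(f"c1 = c2 = 27 pf, stray = {stray} pf -> load = {cl:.1f} pf")
    print("caps for a 16 pf load with 5 pf stray:",
          matched_caps_for_load(16, 5), "pf")
```

Because the oscillator frequency falls as the load rises, a sweep like this only gives a starting point; the text still calls for fine tuning against frequency counter measurements.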
13.3.1.8 equivalent series resistance

equivalent series resistance (esr) is the real component of the crystal's impedance at the calibration frequency, which the inverting amplifier's loop gain must overcome. esr varies inversely with frequency for a given crystal family. the lower the esr, the faster the crystal starts up. use crystals with an esr value of 50 Ω or better.

13.3.1.9 drive level

drive level refers to power dissipation in use. the allowable drive level for a surface mounted technology (smt) crystal is less than its through-hole counterpart, because surface mount crystals are typically made from narrow, rectangular at strips, rather than circular at quartz blanks.

some crystal data sheets list crystals with a maximum drive level of 1 mw. however, intel ethernet controllers drive crystals to a level less than the suggested 0.3 mw value. this parameter does not have much value for on-chip oscillator use.

13.3.1.10 aging

aging is a permanent change in frequency (and resistance) occurring over time. this parameter is most important in its first year because new crystals age faster than old crystals. use crystals with a maximum of 5 ppm per year aging.

13.3.1.11 reference crystal

the normal tolerances of the discrete crystal components can contribute to small frequency offsets with respect to the target center frequency. to minimize the risk of tolerance-caused frequency offsets causing a small percentage of production line units to be outside of the acceptable frequency range, it is important to account for those shifts while empirically determining the proper values for the discrete loading capacitors, c1 and c2.

even with a perfect support circuit, most crystals will oscillate slightly higher or slightly lower than the exact center of the target frequency. therefore, frequency measurements (which determine the correct value for c1 and c2) should be performed with an ideal reference crystal.
when the capacitive load is exactly equal to the crystal's load rating, an ideal reference crystal will be perfectly centered at the desired target frequency.

13.3.1.11.1 reference crystal selection

there are several methods available for choosing the appropriate reference crystal:
• if a saunders and associates (s&a) crystal network analyzer is available, then discrete crystal components can be tested until one is found with zero or nearly zero ppm deviation (with the appropriate capacitive load). a crystal with zero or near zero ppm deviation will be a good reference crystal to use in subsequent frequency tests to determine the best values for c1 and c2.
• if a crystal analyzer is not available, then the selection of a reference crystal can be done by measuring a statistically valid sample population of crystals, which has units from multiple lots and approved vendors. the crystal that has an oscillation frequency closest to the center of the distribution should be the reference crystal used during testing to determine the best values for c1 and c2.
• it may also be possible to ask the approved crystal vendors or manufacturers to provide a reference crystal with zero or nearly zero deviation from the specified frequency when it has the specified cload capacitance.
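The second selection method above (pick the unit closest to the center of a measured population) can be sketched as follows. The text does not define "center of the distribution"; using the median is an assumption made here, and the sample labels and frequencies are invented purely for illustration.

```python
from statistics import median


def pick_reference(measured_hz):
    """Return the label of the unit whose frequency is closest to the median.

    measured_hz maps a unit label to its measured oscillation frequency in hz.
    Using the median as the 'center of the distribution' is an assumption.
    """
    center = median(measured_hz.values())
    return min(measured_hz, key=lambda label: abs(measured_hz[label] - center))


if __name__ == "__main__":
    # hypothetical measurements from several lots and vendors
    samples = {
        "lot1-a": 25_000_100,
        "lot1-b": 25_000_020,
        "lot2-a": 24_999_900,
        "lot2-b": 25_000_300,
        "lot3-a": 24_999_990,
    }
    print("reference crystal:", pick_reference(samples))
```

The same helper could be reused for the circuit board selection described in section 13.3.1.11.2, where the board closest to the center of the frequency distribution is chosen.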
when choosing a crystal, customers must keep in mind that to comply with ieee specifications for 10/100 and 10/100/1000base-t ethernet lan, the transmitter reference frequency must be precise within ±50 ppm. intel® recommends that customers use a transmitter reference frequency that is accurate to within ±30 ppm to account for variations in crystal accuracy due to crystal manufacturing tolerance.

13.3.1.11.2 circuit board

since the dielectric layers of the circuit board are allowed some reasonable variation in thickness, the stray capacitance from the printed board (to the crystal circuit) will also vary. if the thickness tolerance for the outer layers of dielectric is controlled within 17 percent of nominal, then the circuit board should not cause more than 2 pf variation to the stray capacitance at the crystal. when tuning crystal frequency, it is recommended that at least three circuit boards are tested for frequency. these boards should be from different production lots of bare circuit boards.

alternatively, a larger sample population of circuit boards can be used. a larger population will increase the probability of obtaining the full range of possible variations in dielectric thickness and the full range of variation in stray capacitance.

next, the exact same crystal and discrete load capacitors (c1 and c2) must be soldered onto each board, and the lan reference frequency should be measured on each circuit board.

the circuit board that has a lan reference frequency closest to the center of the frequency distribution should be used while performing the frequency measurements to select the appropriate value for c1 and c2.

13.3.1.11.3 temperature changes

temperature changes can cause the crystal frequency to shift. therefore, frequency measurements should be done in the final system chassis across the system's rated operating temperature range.
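When frequency counter readings are collected during tuning, each reading can be expressed as a ppm error and compared with the ±50 ppm and ±30 ppm figures given above. The sketch below is one way to do that; the function names and the returned strings are hypothetical.

```python
NOMINAL_HZ = 25_000_000   # crystal frequency from section 13.3.1.2
IEEE_LIMIT_PPM = 50
RECOMMENDED_PPM = 30


def ppm_error(measured_hz, nominal_hz=NOMINAL_HZ):
    """Signed deviation of a measured frequency from nominal, in ppm."""
    return (measured_hz - nominal_hz) * 1e6 / nominal_hz


def classify(measured_hz):
    """Compare a reading with the recommended and ieee tolerance figures."""
    error = abs(ppm_error(measured_hz))
    if error <= RECOMMENDED_PPM:
        return "within recommended 30 ppm"
    if error <= IEEE_LIMIT_PPM:
        return "within ieee 50 ppm only"
    return "out of tolerance"


if __name__ == "__main__":
    for reading in (25_000_500, 25_001_000, 24_998_500):
        print(reading, f"{ppm_error(reading):+.1f} ppm", classify(reading))
```

At 25 MHz, 1 ppm is 25 Hz, so the ±30 ppm recommendation corresponds to ±750 Hz.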
13.3.2 crystal placement and layout recommendations

crystal clock sources should not be placed near i/o ports or board edges. radiation from these devices can be coupled into the i/o ports and radiate beyond the system chassis. crystals should also be kept away from the ethernet magnetics module to prevent interference.

note: failure to follow these guidelines could result in the 25 mhz clock failing to start.

when designing the layout for the crystal circuit, the following rules must be used:
• place load capacitors as close as possible (within design-for-manufacturability rules) to the crystal solder pads. they should be no more than 90 mils away from crystal pads.
• the two load capacitors, crystal component, the ethernet controller device, and the crystal circuit traces must all be located on the same side of the circuit board (maximum of one via-to-ground load capacitor on each xtal trace).
• use 27 pf (5% tolerance) 0402 load capacitors.
• place load capacitor solder pad directly in line with circuit trace (see figure 64, point a).
• use 50 Ω impedance single-ended microstrip traces for the crystal circuit.
• route traces so that electro-magnetic fields from xtal2 do not couple onto xtal1. no differential traces.
• route xtal1 and xtal2 traces to nearest inside corners of crystal pad (see figure 64, point b).
• ensure that the traces from xtal1 and xtal2 are symmetrically routed and that their lengths are matched.
• the total trace length of xtal1 or xtal2 should be less than 750 mils.

figure 64. recommended crystal placement and layout
[figure 64: crystal with two crystal pads, two 27 pf 0402 load capacitors each within 90 mils of a crystal pad, reference points "a" and "b", and xtal1/xtal2 traces to the ethernet controller labeled "less than 660 mils"]

13.4 oscillator support

the 82574 clock input circuit is optimized for use with an external crystal. however, an oscillator can also be used in place of the crystal with the proper design considerations (see table 88 for detailed clock oscillator specifications):
• the clock oscillator has an internal voltage regulator of 1.9 v dc to isolate it from the external noise of other circuits to minimize jitter. if an external clock is used, this imposes a maximum input clock amplitude of 1.9 v dc. for example, if a 3.3 v dc oscillator is used, its signal should be attenuated to a maximum of 1.9 v dc with a resistive divider circuit.
• the input capacitance introduced by the 82574 (approximately 20 pf) is greater than the capacitance specified by a typical oscillator (approximately 15 pf).
• the input clock jitter from the oscillator can impact the 82574 clock and its performance.

note: the power consumption of additional circuitry equals about 1.5 mw.
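The attenuation requirement in section 13.4 (a 3.3 v dc oscillator output reduced to at most 1.9 v dc) can be checked with the ideal resistive divider equation. The sketch below assumes an unloaded two-resistor divider and ignores any coupling capacitor and the input capacitance of the device; the resistor values and function names are illustrative assumptions, not recommended component values.

```python
MAX_INPUT_V = 1.9   # maximum input clock amplitude stated in section 13.4


def divider_output(v_in, r_top_ohm, r_bottom_ohm):
    """Output of an ideal, unloaded resistive divider."""
    return v_in * r_bottom_ohm / (r_top_ohm + r_bottom_ohm)


def within_input_limit(v_out, limit_v=MAX_INPUT_V):
    """True when the attenuated amplitude does not exceed the input limit."""
    return v_out <= limit_v


if __name__ == "__main__":
    # hypothetical equal-value divider on a 3.3 v oscillator output
    v_out = divider_output(3.3, 1_000, 1_000)
    print(f"divider output: {v_out:.2f} v, ok: {within_input_limit(v_out)}")
    # largest division ratio that still respects the limit
    print(f"maximum ratio: {MAX_INPUT_V / 3.3:.3f}")
```

With two equal resistors the amplitude is halved, which leaves some margin below the 1.9 v dc limit.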
table 90 lists oscillators that can be used with the 82574 (please note that no particular oscillator is recommended):

table 90. oscillator manufacturers and part numbers

manufacturer                 part no.
ndk america inc              2560tka-25m
txc corporation - usa        6n25000160 or 7w25000025
citizen america corp         csx750fjb25.000m-ut
raltron electronics corp     co4305-25.000-t-tr
mtronpti                     m214tcn
kyocera corporation          kc5032c-c3

figure 65. oscillator solution
[figure 65: 3.3 v dc clock oscillator (vdd3p3, c1) with its output connected through 1 k ohm resistors and a 1000 pf capacitor to the 82574 xtal1 input]

13.4.1 oscillator placement and layout recommendations

oscillator clock sources should not be placed near i/o ports or board edges. radiation from these devices can be coupled into the i/o ports and radiate beyond the system chassis. oscillators should also be kept away from the ethernet magnetics module to prevent interference.

13.5 ethernet interface

13.5.1 magnetics for 1000 base-t

magnetics for the 82574 can be either integrated or discrete.

the magnetics module has a critical effect on overall ieee and emissions conformance. the device should meet the performance required for a design with reasonable margin to allow for manufacturing variation. occasionally, components that meet basic specifications can cause the system to fail ieee testing because of interactions with other components or the printed circuit board itself. carefully qualifying new magnetics modules prevents this problem.

when using discrete magnetics it is necessary to use bob smith termination: use four 75 Ω resistors for cable-side center taps and unused pins. this method terminates pair-to-pair common mode impedance of the cat5 cable.
use an eft capacitor attached to the termination plane. suggested values are 1500 pf/2 kv or 1000 pf/3 kv. a minimum of 50-mil spacing from capacitor to traces and components should be maintained.

13.5.2 magnetics module qualification steps

the steps involved in magnetics module qualification are similar to those for crystal qualification:
1. verify that the vendor's published specifications in the component datasheet meet or exceed the specifications in section 12.6.
2. independently measure the component's electrical parameters on the test bench, checking samples from multiple lots. check that the measured behavior is consistent from sample to sample and that measurements meet the published specifications.
3. perform physical layer conformance testing and emc (fcc and en) testing in real systems. vary temperature and voltage while performing system level tests.

13.5.3 third-party magnetics manufacturers

the following magnetics modules have been used successfully in previous designs.

type                     manufacturer    part number
low profile discrete     midcom inc.     000-7412-35r-lf1
standard discrete        belfuse         s558-5999-p3 (12-core)
standard discrete        pulse eng.      h5007nl (12-core)
integrated               foxconn         jfm38u1c-l1u1w
integrated               pulse eng.      jw0-0013nl
integrated               amphenol        rjmg2310 22830er c03-002
integrated               belfuse         0862-1j1t-z4-f
integrated               tyco            6368472-1

13.5.4 layout considerations for the ethernet interface

these sections provide recommendations for performing printed circuit board layouts. good layout practices are essential to meet ieee phy conformance specifications and emi regulatory requirements.

critical signal traces should be kept as short as possible to decrease the likelihood of being affected by high frequency noise from other signals, including noise carried on power and ground planes. keeping the traces as short as possible can also reduce capacitive loading.

since the transmission line medium extends onto the printed circuit board, special attention must be paid to layout and routing of the differential signal pairs.

designing for 1000 base-t gigabit operation is very similar to designing for 10 and 100 mb/s. for the 82574, system level tests should be performed at all three speeds.

13.5.4.1 guidelines for component placement

component placement can affect signal quality, emissions, and component operating temperature. this section provides guidelines for component placement.
careful component placement can:
• decrease potential problems directly related to electromagnetic interference (emi), which could cause failure to meet applicable government test specifications.
• simplify the task of routing traces. to some extent, component orientation will affect the complexity of trace routing. the overall objective is to minimize turns and crossovers between traces.

minimizing the amount of space needed for the ethernet lan interface is important because other interfaces compete for physical space on a motherboard near the connector. the ethernet lan circuits need to be as close as possible to the connector.

figure 66. general placement distances for 1000 base-t designs
[figure 66: lan silicon and an integrated rj-45 with lan magnetics, carrying the following annotations]
• keep lan silicon 1" - 4" from lan connector.
• keep silicon traces at least 1" from edge of pb (2" is preferred).
• keep differential pairs more than seven times the dielectric thickness away from each other and other traces, including nvm traces and parallel digital traces.
note: figure 66 represents a 10/100 diagram. use the same design considerations for the two differential pairs not shown for gigabit implementations.

figure 66 shows some basic placement distance guidelines. figure 66 shows two differential pairs, but can be generalized for a gigabit system with four analog pairs. the ideal placement for the ethernet silicon would be approximately one inch behind the magnetics module.

while it is generally a good idea to minimize lengths and distances, figure 66 also illustrates the need to keep the lan silicon away from the edge of the board and the magnetics module for best emi performance.

13.5.4.2 layout guidelines for use with integrated and discrete magnetics

layout requirements are slightly different when using discrete magnetics. these include:
• ground cut for hv installation (not required for integrated magnetics)
• a maximum of two (2) vias
• turns less than 45°
• discrete terminators
figure 67 shows a reference layout for discrete magnetics.

figure 67. layout for discrete magnetics
[figure 67: reference layout showing the rj-45 connector, the magnetics module, and the 82574l]

13.5.4.3 board stack-up recommendations

printed circuit boards for these designs typically have four, six, eight, or more layers. although the 82574 does not dictate the stack up, here is an example of a typical six-layer board stack up:
• layer 1 is a signal layer. it can contain the differential analog pairs from the ethernet device to the magnetics module, or to an optical transceiver.
• layer 2 is a signal ground layer. chassis ground may also be fabricated in layer 2 under the connector side of the magnetics module.
• layer 3 is used for power planes.
• layer 4 is a signal layer.
• layer 5 is an additional ground layer.
• layer 6 is a signal layer. for 1000 base-t (copper) gigabit designs, it is common to route two of the differential pairs (per port) on this layer.

this board stack up configuration can be adjusted to conform to specific oem design rules.
13.5.4.4 differential pair trace routing for 10/100/1000 designs

trace routing considerations are important to minimize the effects of crosstalk and propagation delays on sections of the board where high-speed signals exist. signal traces should be kept as short as possible to decrease interference from other signals, including those propagated through power and ground planes. observe the following suggestions to help optimize board performance:
• maintain constant symmetry and spacing between the traces within a differential pair.
• minimize the difference in signal trace lengths of a differential pair.
• keep the total length of each differential pair under 4 inches. although possible, designs with differential traces longer than 5 inches are much more likely to have degraded receive ber (bit error rate) performance, ieee phy conformance failures, and/or excessive emi (electromagnetic interference) radiation.
• keep differential pairs more than seven times the dielectric thickness away from each other and other traces, including nvm traces and parallel digital traces.
• keep maximum separation within differential pairs to 7 mils.
• for high-speed signals, the number of corners and vias should be kept to a minimum. if a 90° bend is required, it is recommended to use two 45° bends instead. refer to figure 68.
note: in manufacturing, vias are required for testing and troubleshooting purposes. the via size should be a 17-mil (2 mils for manufacturing variance) finished hole size (fhs).
• traces should be routed away from board edges by a distance greater than the trace height above the reference plane. this allows the field around the trace to couple more easily to the ground plane rather than to adjacent wires or boards.
• do not route traces and vias under crystals or oscillators. this will prevent coupling to or from the clock.
as a general rule, keep traces from clocks and drives away from apertures by a distance that is greater than the largest aperture dimension.
• the reference plane for the differential pairs should be continuous and low impedance. it is recommended that the reference plane be either ground or 1.9 v dc (the voltage used by the phy). this provides an adequate return path for high frequency noise currents.
• do not route differential pairs over splits in the associated reference plane as it may cause discontinuity in impedances.

figure 68. trace routing
[figure 68: a trace corner formed from two 45° bends]
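The numeric routing rules in section 13.5.4.4 lend themselves to a simple automated check. The sketch below encodes four of them: total length under 4 inches, separation within a pair of at most 7 mils, clearance to other traces of more than seven times the dielectric thickness, and the two-via budget mentioned in section 13.5.4.2. The `DiffPair` structure and its field names are hypothetical and would need to be mapped onto whatever the layout tool exports.

```python
from dataclasses import dataclass


@dataclass
class DiffPair:
    name: str
    length_in: float            # total routed length, inches
    intra_pair_gap_mils: float  # separation between the two traces of the pair
    clearance_mils: float       # distance to the nearest other trace
    vias: int                   # vias per trace


def check_pair(pair, dielectric_mils):
    """Return a list of rule violations (empty when the pair passes)."""
    problems = []
    if pair.length_in >= 4.0:
        problems.append("length is not under 4 inches")
    if pair.intra_pair_gap_mils > 7.0:
        problems.append("separation within the pair exceeds 7 mils")
    if pair.clearance_mils <= 7.0 * dielectric_mils:
        problems.append("clearance is not more than 7x dielectric thickness")
    if pair.vias > 2:
        problems.append("more than two vias")
    return problems


if __name__ == "__main__":
    pair = DiffPair("mdi0", length_in=5.2, intra_pair_gap_mils=10.0,
                    clearance_mils=40.0, vias=3)
    for problem in check_pair(pair, dielectric_mils=8.0):
        print(f"{pair.name}: {problem}")
```

Rules that depend on geometry the structure does not capture, such as symmetry, bend angles, and reference plane splits, still need a manual review.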
13.5.4.5 signal termination and coupling

the 82574l has internal termination on the mdi signals. external resistors are not needed. adding pads for external resistors can degrade signal integrity.

13.5.4.6 signal trace geometry for 1000 base-t designs

the key factors in controlling trace emi radiation are the trace length and the ratio of trace-width to trace-height above the reference plane. to minimize trace inductance, high-speed signals and signal layers that are close to a reference or power plane should be as short and wide as practical. ideally, this trace width to height above the ground plane ratio is between 1:1 and 3:1. to maintain trace impedance, the width of the trace should be modified when changing from one board layer to another if the two layers are not equidistant from the neighboring planes.

each pair of signals should have a differential impedance of 100 Ω +/- 15%. if a particular tool cannot design differential traces, it is permissible to specify 55-65 Ω single-ended traces as long as the spacing between the two traces is minimized. as an example, consider a differential trace pair on layer 1 that is 8 mils (0.2 mm) wide and 2 mils (0.05 mm) thick, with a spacing of 8 mils (0.2 mm). if the fiberglass layer is 8 mils (0.2 mm) thick with a dielectric constant, εr, of 4.7, the calculated single-ended impedance would be approximately 61 Ω and the calculated differential impedance would be approximately 100 Ω.

when performing a board layout, do not allow the cad tool auto-router to route the differential pairs without intervention. in most cases, the differential pairs will have to be routed manually.

note: measuring trace impedance for layout designs targeting 100 Ω often results in lower actual impedance. designers should verify actual trace impedance and adjust the layout accordingly. if the actual impedance is consistently low, a target of 105 Ω to 110 Ω should compensate for second order effects.

it is necessary to compensate for trace-to-trace edge coupling, which can lower the differential impedance by up to 10 Ω, when the traces within a pair are closer than 30 mils (edge to edge).

13.5.4.7 trace length and symmetry for 1000 base-t designs

as indicated earlier, the overall length of differential pairs should be less than four inches measured from the ethernet device to the magnetics.

the differential traces (within each pair) should be equal in total length to within 50 mils (1.25 mm) and as symmetrical as possible. asymmetrical and unequal length traces in the differential pairs contribute to common mode noise. if a choice has to be made between matching lengths and fixing symmetry, more emphasis should be placed on fixing symmetry. common mode noise can degrade the receive circuit's performance and contribute to radiated emissions.

13.5.4.7.1 signal detect

each port of the 82574 has a signal detect pin for connection to optical transceivers. for designs without optical transceivers, these signals can be left unconnected because they have internal pull-up resistors. signal detect is not a high-speed signal and does not require special layout.
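The worked impedance example in section 13.5.4.6 can be approximated with widely published closed-form estimates. The datasheet does not say how its figures were calculated, so the formulas below (the IPC-2141 style surface microstrip equation and a common edge-coupled differential correction) are an assumption; a field solver or the board fabricator's stack-up calculator remains the better reference. With the example geometry, the results should land close to the 61 Ω and 100 Ω quoted in the text.

```python
import math


def microstrip_z0(width, thickness, height, er):
    """Single-ended surface microstrip impedance (IPC-2141 style estimate).

    width, thickness and height share the same length unit (mils here).
    """
    return 87.0 / math.sqrt(er + 1.41) * math.log(
        5.98 * height / (0.8 * width + thickness))


def edge_coupled_zdiff(z0, spacing, height):
    """Differential impedance of an edge-coupled microstrip pair (estimate)."""
    return 2.0 * z0 * (1.0 - 0.48 * math.exp(-0.96 * spacing / height))


if __name__ == "__main__":
    # geometry from the example: 8 mil wide, 2 mil thick, 8 mil spacing,
    # 8 mil dielectric with a dielectric constant of 4.7
    z0 = microstrip_z0(width=8.0, thickness=2.0, height=8.0, er=4.7)
    zdiff = edge_coupled_zdiff(z0, spacing=8.0, height=8.0)
    print(f"single-ended: {z0:.1f} ohm, differential: {zdiff:.1f} ohm")
```

These approximations lose accuracy outside moderate width-to-height ratios, which is one more reason to verify the finished boards by measurement as the note in section 13.5.4.6 advises.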
13.5.4.8 routing 1.9 v dc to the magnetics center tap

the central-tap 1.9 v dc should be delivered as a solid supply plane (1.9 v dc) directly to the magnetic module or, if this is not possible, by a short and thick trace (lower than 0.2 Ω dc resistance). the decoupling capacitors for the central tap pins should be placed as close as possible to the magnetic component. this improves both emi and ieee compliance.

13.5.4.9 impedance discontinuities

impedance discontinuities cause unwanted signal reflections. minimize vias (signal through holes) and other transmission line irregularities. if vias must be used, a reasonable budget is two per differential trace. unused pads and stub traces should also be avoided.

13.5.4.10 reducing circuit inductance

traces should be routed over a continuous reference plane with no interruptions. if there are vacant areas on a reference or power plane, the signal conductors should not cross the vacant area. this causes impedance mismatches and associated radiated noise levels. noisy logic grounds should be separated from analog signal grounds to reduce coupling. noisy logic grounds can sometimes affect sensitive dc subsystems such as analog to digital conversion, operational amplifiers, etc. all ground vias should be connected to every ground plane; and similarly, every power via, to all power planes at equal potential. this helps reduce circuit inductance. another recommendation is to physically locate grounds to minimize the loop area between a signal path and its return path. rise and fall times should be as slow as possible, because signals with fast rise and fall times contain many high frequency harmonics, which can radiate significantly. the most sensitive signal returns closest to the chassis ground should be connected together. this will result in a smaller loop area and reduce the likelihood of crosstalk.
the effect of different configurations on the amount of crosstalk can be studied using electronics modeling software.

13.5.4.11 signal isolation

to maintain best signal integrity, keep digital signals far away from the analog traces. a good rule of thumb is that no digital signal should be within 300 mils (7.5 mm) of the differential pairs. if digital signals on other board layers cannot be separated by a ground plane, they should be routed perpendicular to the differential pairs. if there is another lan controller on the board, take care to keep the differential pairs from that circuit away.

some rules to follow for signal isolation:
• separate and group signals by function on separate layers if possible. keep differential pairs more than seven times the dielectric thickness away from each other and from other traces, including nvm traces and parallel digital traces.
• physically group together all components associated with one clock trace to reduce trace length and radiation.
• isolate i/o signals from high-speed signals to minimize crosstalk, which can increase emi emission and susceptibility to emi from other signals.
• avoid routing high-speed lan traces near other high-frequency signals associated with a video controller, cache controller, processor, or other similar devices.
13.5.4.12 traces for decoupling capacitors

traces between decoupling and i/o filter capacitors should be as short and wide as practical. long and thin traces are more inductive and would reduce the intended effect of decoupling capacitors. for similar reasons, traces to i/o signals and signal terminations should be as short as possible. vias to the decoupling capacitors should be sufficiently large in diameter to decrease series inductance.

13.5.4.13 light emitting diodes for designs based on the 82574

the 82574 provides three programmable high-current push-pull (active high) outputs to directly drive leds for link activity and speed indication. each lan device provides an independent set of led outputs; these pins and their function are bound to a specific lan device. each of the three led outputs can be individually configured to select the particular event, state, or activity that is indicated on that output. in addition, each led can be individually configured for output polarity, as well as for blinking versus non-blinking (steady-state) indication.

since the leds are likely to be integral to a magnetics module, take care to route the led traces away from potential sources of emi noise. in some cases, it may be desirable to attach filter capacitors. the led ports are fully programmable through the nvm interface.

13.5.5 physical layer conformance testing

physical layer conformance testing (also known as ieee testing) is a fundamental capability for all companies with ethernet lan products. phy testing is the final determination that a layout has been performed successfully. if your company does not have the resources and equipment to perform these tests, consider contracting the tests to an outside facility.

13.5.5.1 conformance tests for 10/100/1000 mb/s designs

crucial tests are as follows, listed in priority order:
• bit error rate (ber). good indicator of real world network performance. perform bit error rate testing with long and short cables and many link partners. the test limit is 10^-11 errors.
• output amplitude, rise and fall time (10/100 mb/s), symmetry and droop (1000 mb/s). for the 82574 controller, use the appropriate phy test waveform.
• return loss. indicator of proper impedance matching, measured through the rj-45 connector back toward the magnetics module.
• jitter test (10/100 mb/s) or unfiltered jitter test (1000 mb/s). indicator of clock recovery ability (master and slave for gigabit controller).

13.5.6 troubleshooting common physical layout issues

the following is a list of common physical layer design and layout mistakes in lan on motherboard designs.
1. lack of symmetry between the two traces within a differential pair. asymmetry can create common-mode noise and distort the waveforms. for each component and/or via that one trace encounters, the other trace should encounter the same component or a via at the same distance from the ethernet silicon.
2. unequal length of the two traces within a differential pair. inequalities create common-mode noise and will distort the transmit or receive waveforms.
3. excessive distance between the ethernet silicon and the magnetics. long traces on fr4 fiberglass epoxy substrate will attenuate the analog signals. in addition, any impedance mismatch in the traces will be aggravated if they are longer than the four inch guideline.
4. routing any other trace parallel to and close to one of the differential traces. crosstalk getting onto the receive channel will cause degraded long cable ber. crosstalk getting onto the transmit channel can cause excessive emi emissions and can cause poor transmit ber on long cables. at a minimum, other signals should be kept 0.3 inches from the differential traces.
5. routing one pair of differential traces too close to another pair of differential traces. after exiting the ethernet silicon, the trace pairs should be kept 0.3 inches or more away from the other trace pairs. the only possible exceptions are in the vicinities where the traces enter or exit the magnetics, the rj-45 connector, and the ethernet silicon.
6. use of a low-quality magnetics module.
7. re-use of an out-of-date physical layer schematic in an ethernet silicon design. the terminations and decoupling can be different from one phy to another.
8. incorrect differential trace impedances. it is important to have ~100 Ω impedance between the two traces within a differential pair. this becomes even more important as the differential traces become longer. to calculate differential impedance, many impedance calculators only multiply the single-ended impedance by two. this does not take into account edge-to-edge capacitive coupling between the two traces. when the two traces within a differential pair are kept close to each other, the edge coupling can lower the effective differential impedance by 5 Ω to 20 Ω. short traces have fewer problems if the differential impedance is slightly off target.
13.6 smbus and nc-si

smbus and nc-si are optional interfaces for pass-through and/or configuration traffic between the mc and the 82574. see section 3.4 and section 3.5 for more details.

this section describes the hardware implementation requirements necessary to meet the nc-si physical layer standard. board-level design requirements are included for connecting the 82574 ethernet solution to an external mc. the layout and connectivity requirements are addressed in low-level detail. this section, in conjunction with the network controller sideband interface (nc-si) specification version 1.0 rmii specification, also provides the complete board-level requirements for the nc-si solution.

the 82574's on-board system management bus (smbus) port enables network manageability implementations required for remote control and alerting via the lan. with smbus, management packets can be routed to or from an mc. enhanced pass-through capabilities also enable system remote control over standardized interfaces. also included is a new manageability interface, nc-si, that supports the dmtf preos sideband protocol. an internal management interface called mdio enables the mac (and software) to monitor and control the phy.
13.6.1 nc-si electrical interface requirements

13.6.1.1 external mc

the external mc is required to meet the latest nc-si specification as it relates to the rmii electrical interface.

13.6.1.2 nc-si reference schematics

figure 69 shows the single-drop application connectivity requirements. figure 70 shows the multi-drop application connectivity requirements. refer to the latest nc-si specification for any additional connectivity requirements.

figure 69. nc-si connection requirements - single-drop configuration
(figure labels: 82574 nc-si interface signals nc-si_clk_in, nc-si_crs_dv, nc-si_rxd_0, nc-si_rxd_1, nc-si_tx_en, nc-si_txd_0, nc-si_txd_1; dmtf compliant bmc device signals ref_clk, crs_dv, rxd_0, rxd_1, tx_en, txd_0, txd_1; 50 mhz reference clock buffer; component values shown: 33, 22, 10k; 3.3v)
figure 70. nc-si connection requirements - multi-drop configuration
(figure labels: two 82574 devices, each with nc-si interface signals nc-si_clk_in, nc-si_crs_dv, nc-si_rxd_0, nc-si_rxd_1, nc-si_tx_en, nc-si_txd_0, nc-si_txd_1; dmtf compliant bmc device signals ref_clk, crs_dv, rxd_0, rxd_1, tx_en, txd_0, txd_1; 50 mhz reference clock buffer; component values shown: 33, 22, 10k; 3.3v)

13.6.1.3 resets

it is important to ensure that the resets for the mc and the 82574 are generated within a specific time interval. the important requirement here is ensuring that the nc-si link is established within two seconds of the mc receiving the power good signal from the platform. both the 82574 and the external mc need to receive power good signals from the platform within one second of each other. this causes an internal power on reset within the 82574 and then initialization, as well as a triggering and initialization sequence for the mc. once these power good signals are received by both the 82574 and the external mc, the nc-si interface can be initialized. the nc-si specification calls out a requirement of link establishment within two seconds. the mc should poll this interface and establish a link within two seconds to ensure specification compliance.
13.6.1.4 layout requirements

13.6.1.4.1 board impedance

the nc-si signaling interface is a single-ended signaling environment. a target board and trace impedance of 50 Ω, plus 20% and minus 10%, is recommended. this target impedance ensures optimal signal integrity and signal quality.

13.6.1.4.2 trace length restrictions

intel recommends a maximum trace length, from a board placement and routing topology perspective, of eight inches for direct connect applications (figure 71). this ensures that signal integrity and quality are preserved from a design perspective and that compliance is met for the nc-si electrical requirements.

for multi-drop applications (figure 72) the spacing recommendation is a maximum of four inches. this keeps the overall length between the mc and the 82574 within the specification.

figure 71. nc-si trace length requirement for direct connect
(figure labels: 8 inches; 82574; external mc; nc-si_clk_in, nc-si_txd(1:0), nc-si_rxd(1:0), nc-si_crs_dv, nc-si_tx_en)
figure 72. nc-si trace length requirement for multi-drop
(figure labels: 8 inches; 4 inches; 82574; 82574; external mc; nc-si_clk_in, nc-si_txd(1:0), nc-si_rxd(1:0), nc-si_crs_dv, nc-si_tx_en)
13.7 82574 power supplies

the 82574 requires three power rails: 3.3 v dc, 1.9 v dc, and 1.05 v dc (see section 5.4). a central power supply can provide all the required voltage sources or the power can be derived from the 3.3 v dc supply and regulated locally using external regulators. if the lan wake capability is used, all voltages must remain present during system power down. local regulation of the lan voltages from system 3.3 vmain and 3.3 vaux voltages is recommended. refer to section 12.3 and section 12.5 for detailed information about power supply sequencing rules and intended design options for power solutions.

external voltage regulators need to generate the proper voltage, supply current requirements (with adequate margin), and provide the proper power sequencing.

13.7.1 82574 gbe controller power sequencing

designs must comply with power sequencing requirements to avoid latch-up and forward-biased internal diodes (see figure 73).

the general guideline for sequencing is:
1. power up the 3.3 v dc rail.
2. power up the 1.9 v dc next.
3. power up the 1.05 v dc rail last.

for power down, there is no requirement (only charge that remains is stored in the decoupling capacitors).

figure 73. power sequencing guideline
(figure labels: vdd3p3, avdd1p9, vdd1p0)

13.7.1.1 power up sequence (external lvr)

the board designer controls the power up sequence with the following stipulations (see figure 74):
• 1.9 v dc must not exceed 3.3 v dc by more than 0.3 v dc.
• 1.05 v dc must not exceed 1.9 v dc by more than 0.3 v dc.
• 1.05 v dc must not exceed 3.3 v dc by more than 0.3 v dc.
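the three external-lvr stipulations above can be checked against captured rail voltages. the following is a minimal sketch, assuming the rails are sampled at the same instants during power-up; the function name, the sample format, and the example ramps are hypothetical and are not part of the datasheet.

```python
# illustrative check of the external-lvr rail relationships listed above.
# names and sample data are assumptions made for this sketch only.

MARGIN_V = 0.3  # a lower rail must not exceed a higher rail by more than this


def sequencing_violations(samples):
    """Return (index, message) pairs for samples that break a stipulation.

    samples: iterable of (v3p3, v1p9, v1p05) tuples in volts, taken at the
    same instants during power-up (for example, from a scope capture).
    """
    violations = []
    for i, (v3p3, v1p9, v1p05) in enumerate(samples):
        if v1p9 > v3p3 + MARGIN_V:
            violations.append((i, "1.9 v exceeds 3.3 v by more than 0.3 v"))
        if v1p05 > v1p9 + MARGIN_V:
            violations.append((i, "1.05 v exceeds 1.9 v by more than 0.3 v"))
        if v1p05 > v3p3 + MARGIN_V:
            violations.append((i, "1.05 v exceeds 3.3 v by more than 0.3 v"))
    return violations


# hypothetical ramps: the first follows the 3.3 -> 1.9 -> 1.05 order,
# the second raises the 1.9 v rail before the 3.3 v rail is up.
good_ramp = [(0.0, 0.0, 0.0), (1.5, 0.5, 0.0), (3.3, 1.9, 0.6), (3.3, 1.9, 1.05)]
bad_ramp = [(0.0, 0.0, 0.0), (0.2, 1.0, 0.0), (3.3, 1.9, 1.05)]

if __name__ == "__main__":
    print(sequencing_violations(good_ramp))
    print(sequencing_violations(bad_ramp))
```

the sketch only tests the stated voltage relationships; it does not model ramp rates or the internal-lvr case.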
figure 74. external lvr power-up sequence
(figure labels: vdd3p3, avdd1p9, vdd1p0)

13.7.1.2 power-up sequence (internal lvr)

the 82574 controls the power-up sequence internally and automatically with the following conditions (see figure 75):
• 3.3 v dc must be the source for the internal lvr.
• 1.9 v dc never exceeds 3.3 v dc.
• 1.05 v dc never exceeds 3.3 v dc or 1.9 v dc.

the ramp is delayed internally, with t_delay depending on the rising slope of the 3.3 v dc ramp.

figure 75. internal lvr power-up sequence
(figure labels: avdd1p9, vdd1p0, vdd3p3)
13.7.2 power and ground planes

good grounding requires minimizing inductance levels in the interconnections. keeping ground returns short, signal loop areas small, and power inputs bypassed to signal return significantly reduces emi radiation.

the following guidelines help reduce circuit inductance in both backplanes and motherboards:
• route traces over a continuous plane with no interruptions. do not route over a split power or ground plane. if there are vacant areas on a ground or power plane, avoid routing signals over the vacant area; doing so increases inductance and emi radiation levels.
• separate noisy digital grounds from analog grounds to reduce coupling. noisy digital grounds may affect sensitive dc subsystems.
• all ground vias should be connected to every ground plane, and every power via should be connected to all power planes at equal potential. this helps reduce circuit inductance.
• physically locate grounds between a signal path and its return. this will minimize the loop area.
• avoid fast rise/fall times as much as possible. signals with fast rise and fall times contain many high frequency harmonics, which can radiate emi.
• the ground plane beneath a magnetics module should be split. the rj45 connector side of the transformer module should have chassis ground beneath it.
• power delivery traces should be a minimum of 100 mils wide at all places from the source to the destination. as power flows through pass transistors or regulators, the traces must be kept wide as well. the distribution of power is better done with a copper pour under the phy. this provides low inductance connectivity to decoupling capacitors. decoupling capacitors should be placed as close as possible to the point of use and should avoid sharing vias with other decoupling capacitors. decoupling capacitor placement control should be done for the phy as well as pass transistors or regulators.
13.8 device disable

for a lom design, it might be desirable for the system to provide bios-setup capability for selectively enabling or disabling lom devices. this gives designers more control over system resource management, helps avoid conflicts with add-in nic solutions, etc. the 82574 provides support for selectively enabling or disabling it.

device disable is initiated by asserting the asynchronous dev_off_n pin. the dev_off_n pin has an internal pull-up resistor, so it can be left unconnected to enable device operation.

the nvm's device disable power down en bit enables device disable mode (hardware default is that the mode is disabled).

while in device disable mode, the pcie link is in l3 state. the phy is in power down mode. output buffers are tri-stated.

assertion or deassertion of pcie pe_rst_n does not have any effect while the 82574 is in device disable mode (that is, the 82574 stays in the respective mode as long as dev_off_n is asserted). however, the 82574 might momentarily exit the device disable mode from the time pcie pe_rst_n is de-asserted again and until the nvm is read.
during power-up, the dev_off_n pin is ignored until the nvm is read. from that point, the 82574 might enter device disable if dev_off_n is asserted.

note: the dev_off_n pin should maintain its state during system reset and system sleep states. it should also ensure the proper default value on system power up. for example, a designer could use a gpio pin that defaults to 1b (enable) and is on system suspend power (that is, it maintains the state in s0-s5 acpi states).

13.8.1 bios handling of device disable

assume that in the following power-up sequence the dev_off_n signal is driven high (or it is already disabled).
1. the pcie is established following the gio_pwr_good.
2. bios recognizes that the entire 82574 should be disabled.
3. the bios drives the dev_off_n signal to the low level.
4. as a result, the 82574 samples the dev_off_n signal and enters the device disable mode.
5. the bios could put the link in the electrical idle state (at the other end of the pcie link) by clearing the link disable bit in the link control register.
6. bios might start with the device enumeration procedure (the entire 82574 functions are invisible).
7. proceed with normal operation.
8. re-enable could be done by driving high the dev_off_n signal, followed later by bus enumeration.

13.9 82574 exposed pad*

13.9.1 introduction

the 82574 is a 64-pin, 9 x 9 qfn package with an exposed-pad*. the exposed-pad* is a central pad on the bottom of the package that provides the primary heat removal path as well as electrical grounding for a printed circuit board (pcb).

in order to maximize both the removal of heat from the package and the electrical performance, a landing pattern must be incorporated on the pcb within the footprint of the package corresponding to the exposed metal pad or exposed heat slug on the package.
the size of the landing pattern can be larger, smaller, or even take on a different shape than the exposed-pad* on the package. however, the solderable area, as defined by the solder mask, should be at least the same size/shape as the exposed-pad* on the package to maximize the thermal/electrical performance.

while the landing pattern on the pcb provides a means of heat transfer/electrical grounding from the package to the board through a solder joint, thermal vias are necessary to effectively conduct from the surface of the pcb to the ground plane(s). the number of vias is application specific and dependent upon the package power dissipation as well as electrical conductivity requirements. as a result, thermal and electrical analysis and/or testing are recommended to determine the minimum number needed.

warning: make sure that the 82574 has a good connection to ground. check for solder voids on the exposed pad*, solder wicking, or a complete lack of solder. failure to ensure a good connection to ground can result in functional failure.
the remainder of this section describes the silkscreen/component pads, solder mask, solder paste, and two potential landing patterns that can be used for the 82574 package. note that these potential landing patterns have been used successfully in past designs; however, no particular landing pattern is recommended. please work with your manufacturer and assembler to ensure a process that is reliable.

13.9.2 component pad, solder mask and solder paste

figure 76, figure 77, and figure 78 show the silkscreen/components pad, solder mask and solder paste area for the 82574 package.

figure 76. 82574 silkscreen and components pad (top view)

figure 77. 82574 solder mask
figure 78. 82574 solder paste
(figure labels: 0.12 in. 0.30 mm; 0.12 in. 0.30 mm; 0.054 in. (1.38 mm) square x 9; metal pattern; solder mask opening)

the stencil for the solder paste should be 5 mils thick. also, use a solder paste alloy consisting of 96.5sn/3ag/0.5cu for a lead free process.

13.9.3 landing pattern a (no via in pad)

this landing pattern (vias outside exposed pad*) provides an extended ground connection, adequate solder coverage and less solder voiding; however, it does not provide thermal relief. this landing pattern also meets intel's recommendation for coverage >= 80%.

figure 79. 82574 landing pattern a (top view - vias on the outside of the exposed pad*)
(figure label: extended ground connection without thermal relief)

use 12 vias distributed on four sides (three per side, as shown in figure 79) or three sides (four per side). additional vias can be added to improve conductivity. if larger vias can be used (14 to 20 mil finished hole size), then a minimum of 9 vias can be evenly placed around the extended ground connection.
13.9.4 landing pattern b (thermal relief; no via in pad)

this landing pattern (vias outside exposed pad*) provides thermal relief, adequate solder coverage, and less solder voiding; however, it does not provide an extended ground connection. this landing pattern also meets intel's recommendation for coverage >= 80%.

figure 80. 82574 landing pattern b (top view - vias on the outside of the exposed pad*)
(figure labels: 32 mil via pad; 10 mil finished hole (small via); 14 to 20 mil finished hole (large thermal via); 44 mil anti-pad; thermal relief 8-spoke pattern; 40 mil minimum)

intel recommends using 16 vias evenly placed (as shown in figure 80) around the extended ground connection. additional vias can be added to improve conductivity. a minimum of 12 larger vias (14 to 20 mil finished hole size) can also be used.
13.10 xor testing

note: bsdl files are not available for the 82574 family.

a common board or system-level manufacturing test for proper electrical continuity between the 82574 and the board is some type of cascaded-xor or nand tree test. the 82574 implements an xor tree spanning most i/o signals. the component xor tree consists of a series of cascaded xor logic gates, each stage feeding in the electrical value from a unique pin. the output of the final stage of the tree is visible on an output pin from the component.

figure 81. xor tree concept

by connecting to a set of test-points or bed-of-nails fixture, a manufacturing test fixture can test connectivity to each of the component pins included in the tree by sequentially testing each pin, testing each pin when driven both high and low, and observing the output of the tree for the expected signal value and/or change.

note: some of the pins that are inputs for the xor test are listed as "may be left disconnected" in the pin descriptions. if xor test is used, all inputs to the xor tree must be connected.

when the xor tree test is selected, the following behaviors occur:
• output drivers for the pins listed as "tested" are all placed in high-impedance (tri-state) state to ensure that board/system test fixture can drive the tested inputs without contention.
• internal pull-up and pull-down devices for pins listed as "tested" are also disabled to further ensure no contention with the board/system test fixture.
• the xor tree is output on the led1 pin.

to enter the xor tree mode, a specific jtag pattern must be sent to the test interface. this pattern is described by the following tdf pattern (dh = drive high, dl = drive low):

    dh(test_en, jtag_tdi)
    dl(jtag_tck,jtag_tms);
    dh(jtag_tck); dl(jtag_tck);
    dh(jtag_tms);
    loop 2
        dh(jtag_tck); dl(jtag_tck);
    end loop
    dl(jtag_tms);
    loop 2
        dh(jtag_tck); dl(jtag_tck);
    end loop
    dl(jtag_tdi);
    dh(jtag_tck); dl(jtag_tck);
    dh(jtag_tdi);
    dh(jtag_tck); dl(jtag_tck);
    dl(jtag_tdi);
    dh(jtag_tck); dl(jtag_tck);
    dh(jtag_tdi);
    dh(jtag_tck); dl(jtag_tck);
    dl(jtag_tdi);
    dh(jtag_tck); dl(jtag_tck);
    dh(jtag_tdi)
    dh(jtag_tms);
    dh(jtag_tck); dl(jtag_tck);
    dl(jtag_tms);
    dh(jtag_tck); dl(jtag_tck);
    dh(jtag_tms);
    dh(jtag_tck); dl(jtag_tck);
    dh(jtag_tck); dl(jtag_tck);
    dl(jtag_tms);
    dh(jtag_tck); dl(jtag_tck);
    hold(jtag_tms,test_en,jtag_tck,jtag_tdi);

table 91. tested pins included in xor tree (17 pins)

note: xor tree reads left-to-right top-to-bottom.

    pin name        pin name        pin name
    led2            smb_dat         smb_alrt_n
    smb_clk         nc_si_txd1      nc_si_txd0
    nc_si_rxd1      nc_si_rxd0      nc_si_crs_dv
    nc_si_clk_in    nvm_si          nc_si_tx_en
    nvm_sk          nvm_so          nvm_cs_n
    led0            led1 (output of the xor tree)
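the pin-walking procedure described in this section (toggle one tested pin at a time and watch the tree output change) can be sketched in software. this is a minimal model, assuming the tree behaves as a plain parity (xor) of the sixteen inputs in table 91 with led1 as the output; the real output polarity is not stated here, which is why the sketch compares against a reference reading rather than an absolute level. the function names and the read_output hook are hypothetical.

```python
from functools import reduce
from operator import xor

# tested input pins in table 91 order; led1 is the tree output, so it is excluded.
XOR_INPUTS = [
    "led2", "smb_dat", "smb_alrt_n", "smb_clk", "nc_si_txd1", "nc_si_txd0",
    "nc_si_rxd1", "nc_si_rxd0", "nc_si_crs_dv", "nc_si_clk_in", "nvm_si",
    "nc_si_tx_en", "nvm_sk", "nvm_so", "nvm_cs_n", "led0",
]


def xor_tree(levels):
    """Assumed model: tree output is the xor of all input levels (0 or 1)."""
    return reduce(xor, (levels[pin] for pin in XOR_INPUTS), 0)


def walk_pins(read_output, baseline=0):
    """Toggle each pin in turn from a common baseline level.

    read_output: callable that takes a {pin: level} dict and returns the level
    observed on the tree output. on a real fixture this would drive the board;
    here it is a hypothetical hook.
    returns the pins whose toggle did not flip the output (suspect connections).
    """
    levels = {pin: baseline for pin in XOR_INPUTS}
    reference = read_output(levels)
    suspects = []
    for pin in XOR_INPUTS:
        levels[pin] ^= 1              # drive this pin to the opposite level
        if read_output(levels) == reference:
            suspects.append(pin)      # output did not change: pin not reaching the tree
        levels[pin] ^= 1              # restore the baseline
    return suspects


def stuck_low(pin_name):
    """Build a simulated device whose named input is stuck at 0 (for illustration)."""
    def read_output(levels):
        seen = dict(levels)
        seen[pin_name] = 0
        return xor_tree(seen)
    return read_output
```

running the walk with both baseline=0 and baseline=1 corresponds to testing each pin when driven both high and low.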
14.0 thermal design considerations

14.1 introduction

this section describes the 82574 thermal characteristics and suggested thermal solutions. use this section to properly design a thermal solution for systems implementing the 82574.

properly designed solutions provide adequate cooling to maintain the 82574 case temperature (tcase) at or below those listed in table 93. ideally, this is accomplished by providing a low, local ambient temperature and creating a minimal thermal resistance to that local ambient temperature. heat sinks might be required if case temperatures exceed those listed in table 93. by maintaining the 82574 case temperature at or below those recommended in this section, the 82574 will function properly and reliably.

14.2 intended audience

the intended audience for this section is system design engineers using the 82574. system designers are required to address component and system-level thermal challenges as the market continues to adopt products with higher speeds and port densities. new designs might be required to provide better cooling solutions for silicon devices depending on the type of system and target operating environment.

14.3 measuring the thermal conditions

this section provides a method for determining the operating temperature of the 82574 in a specific system based on case temperature. case temperature is a function of the local ambient and internal temperatures of the component. this section specifies a maximum allowable tcase for the 82574.

note: removal of the shield lid is required to measure the case temperature.

14.4 thermal considerations

component temperature in a system environment is a function of the component, board, and system thermal characteristics. the board/system-level thermal constraints consist of the following:
• local ambient temperature near the component
• airflow over the component and surrounding board
• physical constraints at, above, and surrounding the component that might limit the size of a thermal enhancement
the component die temperature depends on the following:
• component power dissipation
• size
• packaging materials (effective thermal conductivity)
• type of interconnection to the substrate and motherboard
• presence of a thermal cooling solution
• thermal conductivity
• power density of the substrate/package, nearby components, and circuit board that is attached to it

technology trends continue to push these parameters toward increased performance levels (higher operating speeds), i/o density (smaller packages), and silicon density (more transistors). power density increases and thermal cooling solution space and airflow become more constrained as operating frequencies increase and packaging sizes decrease. these issues result in an increased emphasis on the following:
• package and thermal enhancement technology to remove heat from the device.
• system design to reduce local ambient temperatures and ensure that thermal design requirements are met for each component in the system.

14.5 packaging terminology

the following is a list of packaging terminology used in this section:
• quad flat no leads - plastic encapsulated package with a copper leadframe substrate. package uses perimeter lands on the bottom of the package to provide electrical contact to the pcb. this package is also known as qfn.
• junction - refers to a p-n junction on the silicon. in this section, it is used as a temperature reference point (for example, theta ja refers to the junction to ambient temperature).
• ambient - refers to local ambient temperature of the bulk air approaching the component. it can be measured by placing a thermocouple approximately one inch upstream from the component edge.
• lands - the pads on the pcb that the bga balls are soldered to.
• pcb - printed circuit board.
• printed circuit assembly (pca) - an assembled pcb.
• thermal design power (tdp) - the estimated maximum possible/expected power generated in a component by a realistic application. use the maximum power requirement numbers from table 92.
• lfm - linear feet per minute (airflow)

14.6 product package thermal specification

table 92. package thermal characteristics in standard jedec environment

    package type    est. power (tdp)    θja          ψjt         tj max
    9 mm-64 qfn     473 mw              39.5 °c/w    0.7 °c/w    120 °c
the thermal parameters listed in table 92 are based on simulated results of packages assembled on a 4-layer 30 x 56 mm mini pcie board connected to a system board in a natural convection environment. the maximum case temperature is based on the maximum junction temperature and defined by the relationship

    tcase-max = tjmax - (ψjt x power)

where ψjt is the junction-to-package top thermal characterization parameter. if the case temperature exceeds the specified tcase max, thermal enhancements such as heat sinks or forced air are required. θja is the package junction-to-air thermal resistance.

note: thermal models are available upon request (flotherm 2-resistor, delphi or detailed format).

14.7 thermal specifications

to ensure proper operation and reliability of the 82574, the thermal solution must maintain a case temperature at or below the values specified in table 93. system-level or component-level thermal enhancements are required to dissipate the generated heat if the case temperature exceeds the maximum temperatures listed in table 93.

good system airflow is critical to dissipate the highest possible thermal power. the size and number of fans, vents, and/or ducts, and their placement in relation to components and airflow channels within the system, determine airflow. acoustic noise constraints might limit the size and types of fans, vents and ducts that can be used in a particular design.

to develop a reliable, cost-effective thermal solution, all of the system variables must be considered. use system-level thermal characteristics and simulations to account for individual component thermal requirements.

table 93. 82574 preliminary thermal absolute maximum rating

    parameter     maximum
    tcase (1)     109 °c

1. tcase is defined as the maximum case temperature without any thermal enhancement to the package.

14.7.1 case temperature

the 82574 is designed to operate properly as long as the tcase is not exceeded. section 14.12 describes the proper guidelines for measuring case temperature.

14.7.2 designing for thermal performance

section 14.14 describes the pcb and system design recommendations required to achieve the required 82574 thermal performance.
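the case temperature relationship given in section 14.6 (maximum case temperature equals maximum junction temperature minus the junction-to-package-top parameter times power) can be worked through numerically. the sketch below only illustrates the arithmetic using the table 92 values; the function and constant names are hypothetical, and the result is not a substitute for the tcase limit listed in table 93, which remains the value to design against.

```python
# illustrative arithmetic for the section 14.6 relationship:
#   tcase-max = tjmax - (psi_jt x power)
# constant values are taken from table 92; names are assumptions for this sketch.

TJ_MAX_C = 120.0         # maximum junction temperature, degrees c
PSI_JT_C_PER_W = 0.7     # junction-to-package-top characterization parameter, c/w
TDP_W = 0.473            # estimated power (tdp), watts (473 mw)


def tcase_max(tj_max_c, psi_jt_c_per_w, power_w):
    """Return the case temperature implied by the junction limit, in degrees c."""
    return tj_max_c - psi_jt_c_per_w * power_w


if __name__ == "__main__":
    # 0.7 c/w x 0.473 w is about a 0.33 c drop from junction to package top
    print(round(tcase_max(TJ_MAX_C, PSI_JT_C_PER_W, TDP_W), 2))
```

because the junction-to-top drop at tdp is small, the case temperature measured at the package top tracks the junction temperature closely under this relationship.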
14.8 thermal attributes

14.8.1 typical system definitions

the following system example is used to generate thermal characteristics data. note that the evaluation board is a four-layer 30 x 56 mm mpcie board.
• all data is preliminary and is not validated against physical samples. specific system designs might be significantly different.
• a larger board size with more than four copper layers might increase the 82574 thermal performance.

figure 82. 82574 test setup

note: the mpcie board is connected to the bottom side of the system board.
14.9 82574 Package Thermal Characteristics

Table 94. Expected Tcase (°C) at TDP

Ambient             Airflow (LFM)
temperature (°C)    0      100    200    300    400
85                  103    101    99     98     97
75                  93     91     89     88     87
70                  88     86     84     83     82
65                  83     81     79     78     77
55                  73     71     69     68     67
45                  63     61     59     58     57
35                  53     51     49     48     47
0                   18     16     14     13     12

Figure 83. Maximum Allowable Ambient Temperature vs. Air Flow

14.10 Reliability

Each PCA, system, and heat sink combination varies in attach strength and long-term adhesive performance. Carefully evaluate the reliability of the completed assembly prior to high-volume use. Some reliability recommendations are listed in Table 95.
Table 95. Reliability Validation

Test (1) | Requirement | Pass/Fail Criteria (2)
Mechanical shock | 50 g, board level, 11 ms, 2 shocks/axis | Visual and electrical check
Random vibration | 7.3 g, board level, 45 minutes/axis, 50 to 2000 Hz | Visual and electrical check
High-temperature life | +85 °C, 2000 hours total; checkpoints occur at 168, 500, 1000, and 2000 hours | Visual and mechanical check
Thermal cycling | Per-target environment (for example, -40 °C to +85 °C), 500 cycles | Visual and mechanical check
Humidity | 85% relative humidity, 85 °C, 1000 hours | Visual and mechanical check

1. Perform the above tests on a sample size of at least 12 assemblies from three lots of material (total = 36 assemblies).
2. Additional pass/fail criteria can be added as necessary.

14.11 Measurements for Thermal Specifications

Determining the thermal properties of the system requires careful case temperature measurements. Guidelines for measuring 82574 case temperature are provided in Section 14.12.

14.12 Case Temperature Measurements

Maintain 82574 Tcase at or below the maximum case temperatures listed in Table 93 to ensure functionality and reliability. Special care is required when measuring the case temperature to ensure an accurate temperature measurement. Use the following guidelines when making case measurements:

- Measure the surface temperature of the case in the geometric center of the case top.
- Calibrate the thermocouples used to measure Tcase before making temperature measurements.
- Use 36-gauge (maximum) K-type thermocouples.

Care must be taken to avoid introducing errors into the measurements when measuring a surface temperature that is a different temperature from the surrounding local ambient air. Measurement errors might be due to a poor thermal contact between the thermocouple junction and the surface of the package, heat loss by radiation, convection, conduction through thermocouple leads, and/or contact between the thermocouple cement and the heat-sink base (if used).
14.12.1 Attaching the Thermocouple

The following approach is recommended to minimize measurement errors when attaching the thermocouple to the case.

- Use 36 gauge or smaller diameter K-type thermocouples.
- Ensure that the thermocouple has been properly calibrated.
- Attach the thermocouple bead or junction to the top surface of the package (case) in the center of the package using high thermal conductivity cements.
  Note: It is critical that the entire thermocouple lead be butted tightly to the top of the package.
- Attach the thermocouple at a 0° angle if there is no interference with the thermocouple attach location or leads (Figure 84). This is the preferred method and is recommended for use with non-enhanced packages.

Figure 84. Technique for Measuring Tcase with a 0° Angle Attachment

14.13 Conclusion

Increasingly complex systems require better power dissipation. Care must be taken to ensure that the additional power is properly dissipated. Heat can be dissipated using improved system cooling, selective use of ducting, passive or active heat sinks, or any combination.

The simplest and most cost-effective method is to improve the inherent system cooling characteristics through careful design and placement of fans, vents, and ducts. When additional cooling is required, thermal enhancements may be implemented in conjunction with enhanced system cooling. The size of the fan or heat sink can be varied to balance size and space constraints with acoustic noise.

This section has presented the conditions and requirements to properly design a cooling solution for systems implementing the 82574. Properly designed solutions provide adequate cooling to maintain the 82574 case temperature at or below those listed in Table 93. Ideally, this is accomplished by providing a low local ambient temperature and creating a minimal thermal resistance to that local ambient temperature. Alternatively, heat sinks might be required if case temperatures exceed those listed in Table 93.

By maintaining the 82574 case temperature at or below those recommended in this section, the 82574 will function properly and reliably.

Use this section to understand the 82574 thermal characteristics and compare them to your system environment. Measure the 82574 case temperatures to determine the best thermal solution for your design.
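One way to compare the Table 94 figures with a given system environment is to load the table and compute the margin to the 109 °C limit from Table 93. The sketch below only looks up the tabulated points; the names are illustrative, and any interpolation between rows or columns would be an assumption beyond what the table states.

```python
TCASE_LIMIT_C = 109  # Table 93
AIRFLOW_LFM = (0, 100, 200, 300, 400)

# Table 94: expected Tcase (deg C) at TDP, keyed by ambient temperature (deg C).
EXPECTED_TCASE_C = {
    85: (103, 101, 99, 98, 97),
    75: (93, 91, 89, 88, 87),
    70: (88, 86, 84, 83, 82),
    65: (83, 81, 79, 78, 77),
    55: (73, 71, 69, 68, 67),
    45: (63, 61, 59, 58, 57),
    35: (53, 51, 49, 48, 47),
    0: (18, 16, 14, 13, 12),
}


def expected_tcase(ambient_c: int, airflow_lfm: int) -> int:
    """Look up a tabulated point; raises KeyError/ValueError if not in Table 94."""
    return EXPECTED_TCASE_C[ambient_c][AIRFLOW_LFM.index(airflow_lfm)]


def case_margin(ambient_c: int, airflow_lfm: int) -> int:
    """Degrees of headroom between the expected Tcase and the Table 93 limit."""
    return TCASE_LIMIT_C - expected_tcase(ambient_c, airflow_lfm)


def rise_over_ambient(airflow_lfm: int) -> set:
    """Set of (Tcase - ambient) values seen in one airflow column."""
    col = AIRFLOW_LFM.index(airflow_lfm)
    return {row[col] - ambient for ambient, row in EXPECTED_TCASE_C.items()}


print(case_margin(85, 0))      # 6
print(rise_over_ambient(0))    # {18}
print(rise_over_ambient(400))  # {12}
```

In the tabulated data the case-temperature rise over ambient is the same in every row of a given airflow column (18 °C at 0 LFM down to 12 °C at 400 LFM), which is why each call to `rise_over_ambient` returns a single value.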
14.14 PCB Guidelines

The following general PCB design guidelines are recommended to maximize the thermal performance of QFN packages:

1. When connecting ground (thermal) vias to the ground planes, do not use thermal-relief patterns.
2. Thermal-relief patterns are designed to limit heat transfer between the vias and the copper planes, thus constricting the heat flow path from the component to the ground planes in the PCB.
3. As board temperature also has an effect on the thermal performance of the package, avoid placing the 82574 adjacent to high power dissipation devices.
4. If airflow exists, locate the components in the mainstream of the airflow path for maximum thermal performance. Avoid placing the components downstream, behind larger devices or devices with heat sinks that obstruct the air flow or supply excessively heated air.

Note: The previously mentioned guidelines are not all inclusive and are defined to give known, good design practices to maximize the thermal performance of the components.
15.0 Board Layout and Schematic Checklists

Table 96. Board Layout Checklist

Section: General
- Obtain the most recent documentation and specification updates. Remarks: Documents are subject to frequent change.
- Route the transmit and receive differential traces before routing the digital traces. Remarks: Layout of differential traces is critical.

Section: Placement of the 82574
- Place the 82574 at least one inch from the edge of the board. Remarks: With closer spacing, fields can follow the surface of the magnetics module or wrap past the edge of the board. As a result, EMI might increase. The optimum location is approximately one inch behind the magnetics module.
- Place the 82574 at least one inch from the integrated magnetics module but less than four inches. Remarks: Keep trace length under four inches from the 82574 through the magnetics to the RJ-45 connector. Signal attenuation can cause problems for traces longer than four inches. However, due to near field EMI, the 82574 should be placed at least one inch away from the magnetics module.

Section: PCIe interface
- Place the AC coupling capacitors on the PCI Express* (PCIe*) TX traces as close as possible to the 82574 but not further than 250 mils. Remarks: Size 0402, X7R is recommended. The AC coupling capacitors should be placed near the transmitter for PCIe.
- Place the AC coupling capacitors on the PCIe RX traces as close as possible to the upstream PCIe device but not further than 250 mils. Remarks: Size 0402, X7R is recommended. The AC coupling capacitors should be placed near the transmitter for PCIe.
- Make sure the trace impedance for the PCIe differential pairs is 100 Ω +/- 20%. Remarks: These traces should be routed differentially.
- Match trace lengths within each PCIe pair on a segment-by-segment basis. Remarks: Match trace lengths within a pair to five mils.

Section: Clock source (crystal option)
- Place the crystal within 0.75 inches of the 82574. Remarks: This reduces EMI.
- Place the crystal load capacitors within 0.09 inches of the crystal.
- Keep clock lines away from other digital traces (especially reset signals), I/O ports, board edge, transformers and differential pairs. Remarks: This reduces EMI.
Section: Clock source (oscillator option)
- Ensure the oscillator has its own local power supply decoupling capacitor.
- If the oscillator is shared or is more than two inches away from the 82574, a back-termination resistor should be placed near the oscillator for each 82574. Remarks: This enables tuning to ensure that reflections do not distort the clock waveform.
- Keep clock lines away from other digital traces (especially reset signals), I/O ports, board edge, transformers and differential pairs. Remarks: This reduces EMI.

Section: EEPROM or flash memory
- The NVM can be placed a few inches away from the 82574 to provide better spacing of critical components.

Section: 10/100/1000BASE-T interface traces
- Design traces for 100 Ω differential impedance (±20%). Remarks: Primary requirement for 10/100/1000 Mb/s Ethernet. Paired 50 Ω traces do not make 100 Ω differential. An impedance calculator can be used to verify this.
- Avoid highly resistive traces (for example, avoid four mil traces longer than four inches). Remarks: If trace length is a problem, use thicker board dielectrics to allow wider traces. Thicker copper is even better than wider traces.
- If a LAN switch is used or the trace length from the 82574 is greater than four inches, it might be necessary to boost the voltage at the center tap with a separate power supply to optimize MDI performance. Remarks: Consider using a second 82574 instead of a LAN switch and long MDI traces. It is difficult to achieve excellent performance with long traces and analog LAN switches. Additional optimization effort is required to tune the system, the center tap voltage, and magnetics modules.
- Make traces symmetrical. Remarks: Pairs should be matched at pads, vias and turns. Asymmetry contributes to impedance mismatch.
- Do not make 90° bends. Remarks: Bevel corners with turns based on 45° angles.
- Avoid through holes (vias). Remarks: If vias are used, the budget is two per trace.
- Keep traces close together inside a differential pair. Remarks: Traces should be kept within 10 mils regardless of trace geometry.
- Keep trace-to-trace length difference within each pair to less than 50 mils. Remarks: This minimizes signal skew and common mode noise. Improves long cable performance.
- Pair-to-pair trace length does not have to be matched as differences are not critical. Remarks: The difference between the length of the longest pair and the length of the shortest pair should be kept below two inches.
- Keep differential pairs more than seven times the dielectric thickness away from each other and other traces, including NVM traces and parallel digital traces. Remarks: This minimizes crosstalk and noise injection. Tighter spacing is allowed for the first 200 mils of trace near the components.
- Ensure that line side MDI traces and line side termination are at least 80 mils from all other traces. Remarks: This is to ensure the system can survive a high voltage on the MDI cable (hi-pot).
- Keep traces at least 0.1 inches away from the board edge. Remarks: This reduces EMI.
- Do not have stubs along the traces. Remarks: Stubs cause discontinuities that impact return loss.
- Digital signals on adjacent layers must cross at 90° angles. Splits in power and ground planes must not cross. Remarks: Differential pairs should be run on different layers as needed to improve routing.
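Several of the MDI trace rules in the layout checklist are simple numeric limits, so they lend themselves to a scripted check on lengths exported from a layout tool. The sketch below is illustrative only: the `MdiPair` structure and function name are hypothetical, and it covers just four of the rules (within-pair mismatch under 50 mils, total length under four inches, at most two vias per trace, pair-to-pair spread under two inches).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MdiPair:
    name: str
    p_len_mils: float     # length of the positive trace
    n_len_mils: float     # length of the negative trace
    vias_per_trace: int


def check_mdi_pairs(pairs: List[MdiPair]) -> List[str]:
    """Return a list of rule violations; an empty list means all checks passed."""
    issues = []
    for p in pairs:
        if abs(p.p_len_mils - p.n_len_mils) >= 50:
            issues.append(f"{p.name}: within-pair mismatch is 50 mils or more")
        if max(p.p_len_mils, p.n_len_mils) > 4000:
            issues.append(f"{p.name}: trace longer than four inches")
        if p.vias_per_trace > 2:
            issues.append(f"{p.name}: more than two vias per trace")
    longest = [max(p.p_len_mils, p.n_len_mils) for p in pairs]
    if longest and max(longest) - min(longest) >= 2000:
        issues.append("pair-to-pair length spread is two inches or more")
    return issues


good = [MdiPair("MDI0", 3000, 3020, 0), MdiPair("MDI1", 3500, 3490, 2)]
print(check_mdi_pairs(good))  # []
```

Rules that depend on geometry, such as spacing relative to dielectric thickness or bend angles, are left out because they need more than per-trace lengths.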
Section: NC-SI
- Design traces for 50 Ω single-ended impedance (+20%, -10%).
- There should be less than eight inches of trace between the 82574 and the manageability controller (MC).
- There should be less than 30 pF total trace capacitance.
- There should be less than four inches of trace between the 82574 and any other devices sharing the NC-SI bus.

Section: 10/100/1000BASE-T interface magnetics module
- Capacitors connected to center taps should be placed very close (less than 0.1 inch recommended) to the integrated magnetics module. Remarks: This improves bit error rate (BER).
- The system side center tap on the transformer should be connected to the 1.9 V DC power supply through a plane. Remarks: The center tap voltage is critical to performance of the MDI interface. Any voltage drop can cause violations to the specification. Some designs that have a resistive path to the MDI transformer may require additional regulators to boost the voltage to above 1.9 V DC at the transformer center tap.

Section: 10/100/1000BASE-T interface chassis ground
- Provide a separate chassis ground "island" to ground the shroud of the RJ-45 connector and, if needed, to terminate the line side of the magnetics module. Remarks: This design improves EMI behavior. The split in the ground plane should be at least 50 mils. For discrete magnetics modules, the split should run under the center of the magnetics module. Differential pairs never cross the split.
- Ensure there is a gap to provide high voltage isolation to the line side of the MDI traces and the Bob Smith termination. Remarks: The Bob Smith termination and the MDI traces should be >= 80 mils away from all components and traces on the same layer.
- Ensure there is at least 10 mils of single ply woven epoxy (FR-4) between the chassis ground and any other nodes. Remarks: Since there can be small air pockets between woven fibers, it is better to use thicker, two ply, or three ply epoxy (FR-4) to provide high voltage isolation.
- Place 4-6 pairs of pads for stitching capacitors to bridge the gap from chassis ground to signal ground. Remarks: Determine exact number and values empirically based on EMI performance.

Section: Power supply and signal ground
- When using the internal regulator control circuits of the 82574 with external PNP transistors, keep the trace length from the CTRL10 and CTRL19 output balls to the transistors very short (less than one inch) and use 50 mil (minimum) wide traces. Remarks: A low inductive loop should be kept from the regulator control pin, through the PNP transistor, and back to the chip from the transistor's collector output. The power pins should connect to the collector of the transistor through a power plane to reduce the inductive path. This reduces oscillation and ripple in the power supply.
- Use planes if possible. Remarks: Narrow finger-like planes and very wide traces are allowed. If traces are used, 100 mils is the minimum.
- The 1.05 V DC and 1.9 V DC regulating circuits require 1/2 inch x 1/2 inch thermal relief pads for each PNP. Remarks: The pads should be placed on the top layer, under the PNP.
- The 3.3 V DC rail should have at least 25 µF of capacitance. The 1.05 V DC and 1.9 V DC rails should have 20-40 µF of capacitance. Remarks: Place these to minimize the inductance from each power pin to the nearest decoupling capacitor.
- Place decoupling and bulk capacitors close to the 82574, with some along every side, using short, wide traces and large vias. Remarks: If power is distributed on traces, bulk capacitors should be used at both ends. If power is distributed on cards, bulk capacitors should be used at the connector.
- If using decoupling capacitors on LED lines, place them carefully. Remarks: Capacitors on LED lines should be placed near the LEDs.

Section: LED circuits
- Keep LED traces away from sources of noise, for example, high speed digital traces running in parallel. Remarks: LED traces can carry noise into integrated magnetics modules, RJ-45 connectors, or out to the edge of the board, increasing EMI.
Table 97. Schematic Checklist

Section: General
- Obtain the most recent documentation and specification updates. Remarks: Documents are subject to frequent change.
- Observe instructions for special pins needing pull-up or pull-down resistors.

Section: PCIe interface
- Connect PCIe interface pins to corresponding pins on an upstream PCIe device.
- Place AC coupling capacitors (0.1 µF) near the PCIe transmitter. Remarks: Size 0402, X7R is recommended.
- Connect PECLKn and PECLKp to the 100 MHz PCIe system clock. Remarks: This is required by the PCIe interface.
- Connect PE_RST_N to PLTRST# on an upstream PCIe device. Remarks: This is required for proper device initialization.
- Connect PE_WAKE_N to PE_WAKE# on an upstream PCIe device. Remarks: This is required to enable Wake on LAN functionality required for advanced power management.

Section: Support pins
- Connect pin 28, DEV_OFF_N, to SUPER_IO_GP_DISABLE# or a pull-up with a 1 kΩ resistor. Remarks: Connect to a super I/O pin that retains its value during PCIe reset, is driven from the resume well and defaults to one on power-up. If device off functionality is not needed, then DEV_OFF_N should be connected with an external pull-up resistor. Ensure pull-ups are connected to aux power.
- Pull-down pin 48, RSET, with a 4.99 kΩ 1% resistor. Remarks: This is required by the PCIe and MDI interfaces.
- Pull-up pin 39, AUX_PWR, with a 1 kΩ resistor if the power supplies are derived from always-on auxiliary power rails. Remarks: This pin impacts operation if the 82574 advertises D3 cold wakeup support on the PCIe bus. Ensure pull-ups are connected to auxiliary power.
- Pull-down pin 29, TEST_EN, with a 1 kΩ resistor. Remarks: This is required to prevent the device from going into test mode during normal operation. This pin must be driven high during the XOR test.

Section: Clock source (oscillator option)
- Use a 25 MHz 50 ppm oscillator. Remarks: The oscillator needs to maintain 50 ppm under all applicable temperature and voltage conditions. Avoid PLL clock buffers. Clock buffers introduce additional jitter. Broadband peak-to-peak jitter must be less than 200 ps.
- Use a local decoupling capacitor on the oscillator power supply.
- The signal from the oscillator must be AC coupled into the 82574. Remarks: The 82574 has internal circuitry to set the input common mode voltage.
- The clock signal going into the 82574 should have an amplitude between 1.2 V DC and 1.9 V DC. Remarks: This can be achieved with a resistive divider network.
Section: Clock source (crystal option)
- Use a 25 MHz, 30 ppm accuracy @ 25 °C crystal. Avoid components that introduce jitter. Remarks: Parallel resonant crystals are required. The calibration load should be 18 pF. Specify equivalent series resistance (ESR) to be 50 Ω or less.
- Connect two load capacitors to the crystal; one on XTAL1 and one on XTAL2. Use 27 pF capacitors as a starting point, but be prepared to change the value based on testing. Remarks: Capacitance affects accuracy of the frequency. Must be matched to crystal specifications, including estimated trace capacitance in the calculation. Use capacitors with low ESR (types C0G or NPO, for example). Refer to the design considerations section of the datasheet and the Intel Ethernet Controllers Timing Device Selection Guide for more information.

Section: NVM
- Use a 0.1 µF decoupling capacitor. Remarks: Applies to EEPROM or flash devices.
- If SPI flash is used, connect pin 38 (NVMT) to ground through a 1 kΩ resistor. If an SPI EEPROM is used, connect pin 38 (NVMT) to 3.3 V DC through a 1 kΩ resistor. Remarks: Ensure pull-ups are connected to auxiliary power.
- The NVM must be powered from auxiliary power. Remarks: The NVM is read when the system is powered on even before main power is available.
- Check connections to NVM_CS_N, NVM_SK, NVM_SI, NVM_SO. Remarks: Pins on the 82574 are connected to same named pins on the NVM. (NVM_SI connects to SI on the NVM. NVM_SO connects to SO on the NVM.)

Section: SMBus
- For best performance, each 82574 should have its own dedicated SMBus link to the SMBus master device. Remarks: The 82574 allows for multiple devices on an SMBus link; however, the SMBus has a very limited throughput. Using multiple devices further limits throughput. The 82574 has errata with respect to SMBus ARP when multiple slave devices are used. Using only a single device per bus avoids these errata.
- If SMBus is not used, connect pull-up resistors to SMB_CLK, SMB_DAT, and SMB_ALRT_N. Remarks: 10 kΩ pull-ups are reasonable values. Ensure pull-ups are connected to auxiliary power. This prevents noise on these pins from causing problems with device operation.
- If SMBus is used, there should be pull-up resistors on SMB_DAT, SMB_ALRT_N and SMB_CLK somewhere on the board. Remarks: SMBus signals are open-drain. Ensure pull-ups are connected to auxiliary power.
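The crystal option entries of the schematic checklist ask for the load capacitors to be matched to the crystal's 18 pF calibration load, including trace capacitance. A common approximation, which is an assumption here and not a formula from this datasheet, treats the two load capacitors as being in series across the crystal with the stray capacitance added in parallel:

```python
def effective_load_pf(c1_pf: float, c2_pf: float, stray_pf: float) -> float:
    """Approximate load seen by the crystal: C1 in series with C2, plus stray."""
    return (c1_pf * c2_pf) / (c1_pf + c2_pf) + stray_pf


def matched_cap_pf(target_load_pf: float, stray_pf: float) -> float:
    """Value for two equal load capacitors that yields the target load."""
    return 2.0 * (target_load_pf - stray_pf)


# With the suggested 27 pF starting value, an 18 pF load implies about 4.5 pF of
# stray (pin plus trace) capacitance under this approximation.
print(effective_load_pf(27.0, 27.0, 4.5))  # 18.0
print(matched_cap_pf(18.0, 4.5))           # 27.0
```

The stray figure of 4.5 pF is simply what makes the 27 pF starting point consistent with an 18 pF load in this model; the actual board value has to be estimated or measured, which is why the checklist says to be prepared to change the capacitors after testing.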
Section: NC-SI
- Use 10 kΩ pull-up resistors on the NC_SI_TXD0, NC_SI_TXD1, NC_SI_RXD0, and NC_SI_RXD1 interfaces. Remarks: Ensure pull-ups are connected to auxiliary power. Refer to the design considerations section of the datasheet for more details.
- Use 10 kΩ pull-down resistors on the NC_SI_TX_EN and NC_SI_CRS_DV interfaces. Remarks: Refer to the design considerations section of the datasheet for more details.
- Use a 33 Ω series resistor on the NC_SI_CLK_IN interface near the clock source. Remarks: This improves signal integrity by preventing reflections. The value might need to be tuned for a specific design.
- Use a 22 Ω series back-termination resistor near the manageability controller (MC) NC_SI_TXD0 and NC_SI_TXD1 interface. Remarks: This reduces reflections on the trace. The value might need to be tuned for a specific design.
- If the NC-SI interface is not used, tie NC_SI_CLK_IN, NC_SI_CRS_DV, and NC_SI_TX_EN each to ground using a 10 kΩ resistor. Remarks: This is required so that noise on these pins does not cause problems with device operation.
- If the NC-SI interface is not used, tie NC_SI_TXD0, NC_SI_TXD1, NC_SI_RXD0, and NC_SI_RXD1 each to 3.3 V DC using a 10 kΩ resistor. Remarks: This is required so that noise on these pins does not cause problems with device operation.

Section: 10/100/1000BASE-T interface traces
- Design traces for 100 Ω differential impedance (±20%). Remarks: Primary requirement for 10/100/1000 Mb/s Ethernet. Paired 50 Ω traces do not make 100 Ω differential. An impedance calculator can be used to verify this.
- Avoid highly resistive traces (for example, avoid four mil traces longer than four inches). Remarks: If trace length is a problem, use thicker board dielectrics to allow wider traces. Thicker copper is even better than wider traces.
- If a LAN switch is used or the trace length from the 82574 is greater than four inches, it might be necessary to boost the voltage at the center tap with a separate power supply to optimize MDI performance. The boosted center tap voltage is between 1.9 V DC and 2.65 V DC and consumes up to 200 mA. Remarks: Consider using a second 82574 instead of a LAN switch and long MDI traces. It is difficult to achieve excellent performance with long traces and analog LAN switches. An optimization effort is required to tune the system, the center tap voltage, and magnetics modules.

Section: 10/100/1000BASE-T interface magnetic module (integrated option)
- Qualify magnetic modules carefully for return loss, insertion loss, open circuit inductance, common mode rejection, and crosstalk isolation. Remarks: A magnetics module is critical to passing IEEE PHY conformance tests and EMI tests.
- Supply 1.9 V DC to the transformer center taps and use 0.01 µF bypass capacitors. If a LAN switch is used or the trace length from the 82574 is greater than four inches, it might be necessary to boost the voltage at the center tap with a separate external power supply to optimize MDI performance. Remarks: 1.9 V DC at the center tap biases the 82574's output buffers. Capacitors with low ESR should be used.
- Ensure there are no termination resistors in the path between the 82574 and the magnetic module. Remarks: The 82574 has an internal termination network.
Section: 10/100/1000BASE-T interface magnetics module (discrete option with RJ-45 connector)
- Bob Smith termination: use 4 x 75 Ω resistors connected to each cable-side center tap. Remarks: Terminate pair-to-pair common mode impedance of the CAT5 cable.
- Bob Smith termination: use an EFT capacitor attached to the chassis ground. Suggested values are 1500 pF/2 kV or 1000 pF/3 kV. Remarks: These capacitors provide high voltage isolation.
- Supply 1.9 V DC to the system side transformer center taps and use 0.01 µF bypass capacitors. If a LAN switch is used or the trace length from the 82574 is greater than four inches, it might be necessary to boost the voltage at the center tap with a separate power supply to optimize MDI performance. Remarks: 1.9 V DC at the center tap biases the 82574's output buffers. Capacitors with low ESR should be used.
- Ensure there is high voltage isolation to the line side of the MDI traces and the Bob Smith termination. Remarks: The Bob Smith termination and the MDI traces should be >= 80 mils away from all components and traces on the same layer. Do not use less than 10 mils of single ply woven epoxy (FR-4). There can be small air pockets between woven fibers. Use thicker, two ply, or three ply epoxy (FR-4).
- Ensure there are no termination resistors in the path between the 82574 and the magnetics. Remarks: The 82574 has an internal termination network.

Section: 10/100/1000BASE-T interface chassis ground
- Provide a separate chassis ground to connect the shroud of the RJ-45 connector and to terminate the line side of the magnetic module. Remarks: This design improves EMI behavior.
- Place pads for approximately 4-6 stitching capacitors to bridge the gap from chassis ground to signal ground. Remarks: Typical values range from 0.1 µF to 4.7 µF. The correct value should be determined experimentally to improve EMI. Past experiments have shown they are not required in some designs.
Section: Integrated power supply (option A and B)
- Provide a 3.3 V DC supply. Use an auxiliary power supply. Remarks: Auxiliary power is necessary to support wake up from power down states.
- Connect the external PNP transistor's base to CTRL19 and the emitter to the 3.3 V DC supply. The collector supplies 1.9 V DC. Remarks: The connections and transistor parameters are critical.
- Connect the external PNP transistor's base to CTRL10 and the emitter to the 3.3 V DC supply. The collector supplies 1.05 V DC. Remarks: The connections and transistor parameters are critical. For option B only.
- Connect a 5 kΩ resistor from CTRL19 to the 3.3 V DC supply. Connect a 5 kΩ resistor from CTRL10 to the 3.3 V DC supply. Remarks: For option B only.
- For option A: connect DIS_REG10 to ground. For option B: connect DIS_REG10 to the 3.3 V DC supply. Remarks: Enable the internal 1.05 V DC regulator if it is used.
- Ensure that there is at least 10 µF of capacitance at the emitters of the PNPs.
- The 3.3 V DC rail should have at least 25 µF of capacitance. The 1.05 V DC and 1.9 V DC rails should have 20-40 µF of capacitance. Remarks: Place these to minimize the inductance from each power pin to the nearest decoupling capacitor.
- Place decoupling and bulk capacitors close to the 82574, with some along every side, using short, wide traces and large vias. Remarks: If power is distributed on traces, bulk capacitors should be used at both ends. If power is distributed on cards, bulk capacitors should be used at the connector.
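The rail capacitance figures quoted for options A and B (at least 25 µF on 3.3 V DC, 20-40 µF on 1.05 V DC and 1.9 V DC) can be checked against a bill of materials with a short script. This is a sketch with hypothetical names; it sums nominal values only and ignores tolerance and DC-bias derating, which a real review would need to consider.

```python
from typing import Dict, List

# (minimum, maximum) total capacitance in uF per rail; None means no upper limit.
RAIL_LIMITS_UF = {
    "3.3V": (25.0, None),
    "1.9V": (20.0, 40.0),
    "1.05V": (20.0, 40.0),
}


def check_rail_capacitance(caps_uf: Dict[str, List[float]]) -> List[str]:
    """Return violations for rails whose summed nominal capacitance is out of range."""
    issues = []
    for rail, (lo, hi) in RAIL_LIMITS_UF.items():
        total = sum(caps_uf.get(rail, []))
        if total < lo:
            issues.append(f"{rail}: {total:.1f} uF is below the {lo:.0f} uF minimum")
        elif hi is not None and total > hi:
            issues.append(f"{rail}: {total:.1f} uF is above the {hi:.0f} uF maximum")
    return issues


bom = {"3.3V": [22.0, 4.7], "1.9V": [10.0, 10.0, 0.1], "1.05V": [22.0, 0.1]}
print(check_rail_capacitance(bom))  # []
```

For option C the checklist states only a 20 µF minimum on the 1.05 V DC and 1.9 V DC rails, so the upper limits in `RAIL_LIMITS_UF` would be set to `None` for that case.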
Section: External power supply (option C)
- Derive all three power supplies from auxiliary power supplies. Remarks: Auxiliary power is necessary to support wake up from power down states.
- If the 1.05 V DC and 1.9 V DC rails are externally supplied, ensure that CTRL10 and CTRL19 are tied to ground through a 3.3 kΩ resistor. Alternatively, they could be left floating. Remarks: Pull-down resistors do not need to be exactly 3.3 kΩ; however, they must be greater than 1 kΩ.
- Connect DIS_REG10 to the 3.3 V DC supply with a 1 kΩ resistor. Remarks: Disable the internal 1.05 V DC regulator.
- It is recommended that the 1.9 V DC supply be tunable with a resistor option. Remarks: Tuning the 1.9 V DC supply might be required to optimize MDI performance.
- The 3.3 V DC rail should have at least 25 µF of capacitance. The 1.05 V DC and 1.9 V DC rails should have at least 20 µF of capacitance. Remarks: Place these to minimize the inductance from each power pin to the nearest decoupling capacitor.
- Place decoupling and bulk capacitors close to the 82574, with some along every side, using short, wide traces and large vias. Remarks: If power is distributed on traces, bulk capacitors should be used at both ends. If power is distributed on cards, bulk capacitors should be used at the connector.
- All voltages should ramp to within their control bands in 100 ms or less. Voltages must ramp in sequence (3.3 V DC ramps first, 1.9 V DC ramps second, 1.05 V DC ramps last). The voltage rise must be monotonic. The minimum rise time on the 3.3 V DC power is 1 ms. Remarks: The 82574 has a power on reset circuit that requires a 1-100 ms ramp time. The rise must be monotonic so the power on reset triggers only once. The sequence is required to protect the ESD diodes connected to the power supplies from being forward biased.

Section: Integrated power supply (option D)
- Provide a 3.3 V DC and 1.9 V DC supply. Derive power supplies from auxiliary power supplies. Remarks: Auxiliary power is necessary to support wake up from power down states.
- Ensure that CTRL10 and CTRL19 are tied to ground through a 3.3 kΩ resistor. Alternatively, they could be left floating. Remarks: Pull-down resistors do not need to be exactly 3.3 kΩ; however, they must be greater than 1 kΩ.
- Connect DIS_REG10 to ground. Remarks: Enable the internal 1.05 V DC regulator.
- The 3.3 V DC rail should have at least 25 µF of capacitance. The 1.05 V DC and 1.9 V DC rails should have 20-40 µF of capacitance. Remarks: Place these to minimize the inductance from each power pin to the nearest decoupling capacitor.
- Place decoupling and bulk capacitors close to the 82574, with some along every side, using short, wide traces and large vias. Remarks: If power is distributed on traces, bulk capacitors should be used at both ends. If power is distributed on cards, bulk capacitors should be used at the connector.
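The option C ramp and sequencing requirements (in band within 100 ms, 3.3 V DC then 1.9 V DC then 1.05 V DC, monotonic rise, at least 1 ms rise time on 3.3 V DC) can be checked against timing data taken from a scope capture. The sketch below makes several assumptions that the checklist does not spell out: the 100 ms window is measured from the start of the 3.3 V DC ramp, sequencing is judged by ramp start times, and rise time is taken as start-to-in-band. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class RailRamp:
    start_ms: float    # time the rail starts rising
    in_band_ms: float  # time the rail enters its control band


def check_option_c_ramp(v3p3: RailRamp, v1p9: RailRamp, v1p05: RailRamp) -> List[str]:
    """Return violations of the option C ramp rules; empty list means pass."""
    issues = []
    t0 = v3p3.start_ms  # assumed reference point for the 100 ms window
    for name, rail in (("3.3 V", v3p3), ("1.9 V", v1p9), ("1.05 V", v1p05)):
        if rail.in_band_ms - t0 > 100.0:
            issues.append(f"{name} rail not in band within 100 ms")
    if not (v3p3.start_ms <= v1p9.start_ms <= v1p05.start_ms):
        issues.append("rails do not ramp in 3.3 V, 1.9 V, 1.05 V order")
    if v3p3.in_band_ms - v3p3.start_ms < 1.0:
        issues.append("3.3 V rise time is below 1 ms")
    return issues


def is_monotonic(samples: Sequence[float]) -> bool:
    """True when a sampled voltage ramp never decreases."""
    return all(b >= a for a, b in zip(samples, samples[1:]))


print(check_option_c_ramp(RailRamp(0, 5), RailRamp(2, 8), RailRamp(4, 10)))  # []
print(is_monotonic([0.0, 1.1, 2.2, 3.3]))                                    # True
```

`is_monotonic` would be applied to the sampled waveform of each rail separately, since the timing check alone cannot detect a dip during the ramp.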
Section: LED circuits
- Basic recommendation is a single green LED for activity and a dual (bi-color) LED for speed. Many other configurations are possible. LEDs are configurable through the NVM. Remarks: Two LED configurations are compatible with integrated magnetic modules. For the link/activity LED, connect the cathode to the LED1 pin and the anode to VCC. For the bi-color speed LED pair, have the LED2 signal drive one end. The other end should be connected to LED0. When LED2 is low, the orange LED is lit. When LED0 is low, the green LED is lit.
- Connect LEDs to 3.3 V DC as indicated in reference schematics. Remarks: Use 3.3 V DC aux for designs supporting wake-up. Consider adding one or two filtering capacitors per LED for extremely noisy situations. Suggested starting value is 470 pF.
- Add current limiting resistors to LED paths. Remarks: Typical current limiting resistors are 250 Ω to 330 Ω when using a 3.3 V DC supply. Current limiting resistors are sometimes included with integrated magnetic modules.

Section: Mfg test
- The 82574 allows a JTAG test access port to enable an XOR tree test. Remarks: Because of pin sharing, the 82574 cannot be used in a JTAG chain. The JTAG pins must be individually driven and sampled.
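For the LED current limiting resistors mentioned in the schematic checklist, the resulting LED current follows from Ohm's law once a forward voltage is known. The sketch below uses a 2.0 V forward voltage purely as an illustrative assumption; the datasheet text here does not give one, so take the value from the LED's own datasheet.

```python
def led_current_ma(supply_v: float, forward_v: float, resistor_ohm: float) -> float:
    """LED current in mA for a series resistor: I = (Vsupply - Vf) / R."""
    return (supply_v - forward_v) / resistor_ohm * 1000.0


ASSUMED_VF = 2.0  # hypothetical LED forward voltage, not from the datasheet

for r in (250.0, 330.0):
    print(f"{r:.0f} ohm -> {led_current_ma(3.3, ASSUMED_VF, r):.2f} mA")
# 250 ohm -> 5.20 mA
# 330 ohm -> 3.94 mA
```

Under that assumption, the typical 250 Ω to 330 Ω range corresponds to roughly 4 to 5 mA per LED on a 3.3 V DC supply.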
16.0 Models

Contact your Intel representative for access to the 82574 IBIS and HSPICE models.
17.0 Reference Schematics

Contact your Intel representative for access to the 82574 reference schematics.

